Immunogenic compositions for Gram positive bacteria such as Streptococcus agalactiae

ABSTRACT

The invention relates to the identification of new adhesin islands within the genomes of several Group A and Group B Streptococcus serotypes and isolates. The adhesin islands are thought to encode surface proteins which are important in the bacteria's virulence. Thus, the adhesin island proteins of the invention may be used in immunogenic compositions for prophylactic or therapeutic immunization against GAS or GBS infection. For example, the invention may include an immunogenic composition comprising one or more of the discovered adhesin island proteins.

FIELD OF THE INVENTION

The invention relates to the identification of adhesin islands within the genome of Streptococcus agalactiae (“GBS”) and the use of adhesin island amino acid sequences encoded by these adhesin islands in compositions for the treatment or prevention of GBS infection. Similar sequences have been identified in other Gram positive bacteria. The invention further includes immunogenic compositions comprising adhesin island amino acid sequences of Gram positive bacteria for the treatment or prevention of infection of Gram positive bacteria. Preferred immunogenic compositions of the invention include an adhesin island surface protein which may be formulated or purified in an oligomeric or pilus form.

BACKGROUND OF THE INVENTION

GBS has emerged in the last 20 years as the major cause of neonatal sepsis and meningitis that affects 0.5-3 per 1000 live births, and an important cause of morbidity among older age groups affecting 5-8 per 100,000 of the population. Current disease management strategies rely on intrapartum antibiotics and neonatal monitoring which have reduced neonatal case mortality from >50% in the 1970's to less than 10% in the 1990's. Nevertheless, there is still considerable morbidity and mortality and the management is expensive. 15-35% of pregnant women are asymptomatic carriers and at high risk of transmitting the disease to their babies. Risk of neonatal infection is associated with low serotype specific maternal antibodies and high titers are believed to be protective. In addition, invasive GBS disease is increasingly recognized in elderly adults with underlying disease such as diabetes and cancer.

The “B” in “GBS” refers to the Lancefield classification, which is based on the antigenicity of a carbohydrate which is soluble in dilute acid and called the C carbohydrate. Lancefield identified 13 types of C carbohydrate, designated A to O, that could be serologically differentiated. The organisms that most commonly infect humans are found in groups A, B, D, and G. Within group B, strains can be divided into at least 9 serotypes (Ia, Ib, Ia/c, II, III, IV, V, VI, VII and VIII) based on the structure of their polysaccharide capsule. In the past, serotypes Ia, Ib, II, and III were equally prevalent in normal vaginal carriage and early onset sepsis in newborns. Type V GBS has emerged as an important cause of GBS infection in the USA, however, and strains of types VI and VIII have become prevalent among Japanese women.

The genome sequence of a serotype V strain 2603 V/R has been published (See Tettelin et al. (2002) Proc. Natl. Acad. Sci. USA, 10.1073/pnas.182380799) and various polypeptides for use as vaccine antigens have been identified (WO 02/34771). The vaccines currently in clinical trials, however, are based primarily on polysaccharide antigens. These suffer from serotype-specificity and poor immunogenicity, and so there is a need for effective vaccines against S. agalactiae infection.

S. agalactiae is classified among the Gram positive bacteria, a collection of about 21 genera of bacteria that colonize humans, have a generally spherical shape, give a positive Gram stain reaction and lack endospores. Gram positive bacteria are frequent human pathogens and include Staphylococcus (such as S. aureus), Streptococcus (such as S. agalactiae (GBS), S. pyogenes (GAS), S. pneumoniae, S. mutans), Enterococcus (such as E. faecalis and E. faecium), Clostridium (such as C. difficile), Listeria (such as L. monocytogenes) and Corynebacterium (such as C. diphtheriae).

It is an object of the invention to provide further and improved compositions for providing immunity against disease and/or infection of Gram positive bacteria. The compositions are based on the identification of adhesin islands within Streptococcal genomes and the use of amino acid sequences encoded by these islands in therapeutic or prophylactic compositions. The invention further includes compositions comprising immunogenic adhesin island proteins within other Gram positive bacteria in therapeutic or prophylactic compositions.

SUMMARY OF THE INVENTION

Applicants have identified a new adhesin island, “GBS Adhesin Island 1”, “AI-1” or “GBS AI-1”, within the genomes of several Group B Streptococcus serotypes and isolates. This adhesin island is thought to encode surface proteins which are important in the bacteria's virulence. In addition, Applicants have discovered that surface proteins within GBS Adhesin Islands form a previously unseen pilus structure on the surface of GBS bacteria. Amino acid sequences encoded by such GBS Adhesin Islands may be used in immunogenic compositions for the treatment or prevention of GBS infection.

A preferred immunogenic composition of the invention comprises an AI-1 surface protein, such as GBS 80, which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Electron micrographs depicting some of the first visualizations of this pilus structure in a wild type GBS strain are shown in FIGS. 16, 17, 49, and 50. In addition, Applicants have transformed a GBS strain with a plasmid comprising the AI surface protein GBS 80 which resulted in increased production of that AI surface protein. The electron micrographs of this mutant GBS strain in FIGS. 13-15 reveal long, hyper-oligomeric structures comprising GBS 80 which appear to cover portions of the surface of the bacteria and stretch far out into the supernatant. These hyper-oligomeric pilus structures comprising a GBS AI surface protein may be purified or otherwise formulated for use in immunogenic compositions.

GBS AI-1 comprises a series of approximately five open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“AI-1 proteins”). Specifically, AI-1 includes polynucleotide sequences encoding for two or more of GBS 80, GBS 104, GBS 52, SAG0647 and SAG0648. One or more of the AI-1 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the AI-1 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

AI-1 typically resides on an approximately 16.1 kb transposon-like element frequently inserted into the open reading frame for trmA. One or more of the AI-1 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The AI surface proteins of the invention may affect the ability of the GBS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GBS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. AI-1 may encode at least one surface protein. Alternatively, AI-1 may encode at least two surface proteins and at least one sortase. Preferably, AI-1 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif or other sortase substrate motif.
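By way of illustration only, the LPXTG sortase substrate motif referred to above may be located in an amino acid sequence with a simple pattern search. The following is a minimal sketch and not a method taken from this specification; the function name find_lpxtg and the toy sequence are hypothetical, and X is assumed to stand for any residue in the one-letter amino acid code.

```python
import re

# "X" in LPXTG is assumed to mean any single residue (one-letter code).
LPXTG = re.compile(r"LP[A-Z]TG")


def find_lpxtg(sequence: str) -> list:
    """Return (0-based position, matched motif) for each LPXTG occurrence."""
    return [(m.start(), m.group()) for m in LPXTG.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration; not a real GBS protein.
toy_sequence = "MKKAALPSTGEEVVLPATGKR"
hits = find_lpxtg(toy_sequence)
```

A search of this kind only flags candidate motifs; whether a given protein is in fact a sortase substrate would need to be established experimentally.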

The GBS AI-1 protein of the composition may be selected from the group consisting of GBS 80, GBS 104, GBS 52, SAG0647 and SAG0648. GBS AI-1 surface proteins GBS 80 and GBS 104 are preferred for use in the immunogenic compositions of the invention.

In addition to the open reading frames encoding the AI-1 proteins, AI-1 may also include a divergently transcribed transcriptional regulator such as araC (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction). It is believed that araC may regulate the expression of the GBS AI operon. (See Korbel et al., Nature Biotechnology (2004) 22(7): 911-917 for a discussion of divergently transcribed regulators in E. coli).

A second adhesin island, “Adhesin Island-2”, “AI-2” or “GBS AI-2”, has also been identified in numerous GBS serotypes. Amino acid sequences encoded by the open reading frames of AI-2 may also be used in immunogenic compositions for the treatment or prevention of GBS infection.

GBS AI-2 comprises a series of approximately five open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, AI-2 includes open reading frames encoding for two or more of GBS 67, GBS 59, GBS 150, SAG1405, SAG1406, 01520, 01521, 01522, 01523, 01523, 01524 and 01525. The GBS AI-2 sequences may be divided into two subgroups. In one embodiment, AI-2 includes open reading frames encoding for two or more of GBS 67, GBS 59, GBS 150, SAG1405, and SAG1406. This collection of open reading frames may be generally referred to as GBS AI-2 subgroup 1. Alternatively, AI-2 may include open reading frames encoding for two or more of 01520, 01521, 01522, 01523, 01523, 01524 and 01525. This collection of open reading frames may be generally referred to as GBS AI-2 subgroup 2.
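For illustration, the two-subgroup division described above may be expressed as a set membership check. This is a sketch under stated assumptions only: the function name ai2_subgroups is hypothetical, the ORF labels are taken from the paragraph above, and the "two or more" threshold mirrors the wording of that paragraph.

```python
# ORF labels as listed in the text for each GBS AI-2 subgroup.
SUBGROUP_1 = {"GBS 67", "GBS 59", "GBS 150", "SAG1405", "SAG1406"}
SUBGROUP_2 = {"01520", "01521", "01522", "01523", "01524", "01525"}


def ai2_subgroups(orfs):
    """Return the subgroup numbers for which two or more listed ORFs are present."""
    found = set(orfs)
    result = []
    if len(found & SUBGROUP_1) >= 2:
        result.append(1)
    if len(found & SUBGROUP_2) >= 2:
        result.append(2)
    return result


# Hypothetical input: ORFs annotated in some isolate of interest.
example = ai2_subgroups(["GBS 67", "GBS 59", "01520"])
```

An empty result indicates that fewer than two ORFs of either subgroup were supplied, in which case no subgroup assignment is suggested by this check.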

One or more of the AI-2 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the AI-2 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

One or more of the AI-2 surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. AI-2 may encode for at least one surface protein. Alternatively, AI-2 may encode for at least two surface proteins and at least one sortase. Preferably, AI-2 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

The AI-2 protein of the composition may be selected from the group consisting of GBS 67, GBS 59, GBS 150, SAG1405, SAG1406, 01520, 01521, 01522, 01523, 01523, 01524 and 01525. AI-2 surface proteins GBS 67, GBS 59, and 01524 are preferred AI-2 proteins for use in the immunogenic compositions of the invention. GBS 67 or GBS 59 is particularly preferred.

GBS AI-2 may also include a divergently transcribed transcriptional regulator such as a RofA like protein (for example rogB). As in AI-1, rogB is thought to regulate the expression of the AI-2 operon.

The GBS AI proteins of the invention may be used in immunogenic compositions for prophylactic or therapeutic immunization against GBS infection. For example, the invention may include an immunogenic composition comprising one or more GBS AI-1 proteins and one or more GBS AI-2 proteins.

The immunogenic compositions may also be selected to provide protection against an increased range of GBS serotypes and strain isolates. For example, the immunogenic composition may comprise a first and second GBS AI protein, wherein a full length polynucleotide sequence encoding for the first GBS AI protein is not present in a genome comprising a full length polynucleotide sequence encoding for the second GBS AI protein. In addition, each antigen selected for use in the immunogenic compositions will preferably be present in the genomes of multiple GBS serotypes and strain isolates. Preferably, each antigen is present in the genomes of at least two (e.g., 3, 4, 5, 6, 7, 8, 9, 10, or more) GBS strain isolates. More preferably, each antigen is present in the genomes of at least two (e.g., at least 3, 4, 5 or more) GBS serotypes.
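The selection criteria above may be sketched as follows, purely as an illustration. The antigen names, isolate names and serotype assignments below are hypothetical placeholders and are not data from this specification; the function names widely_present and extends_coverage are likewise assumptions made for the example.

```python
def widely_present(presence, serotype_of, min_isolates=2, min_serotypes=2):
    """Keep antigens found in enough isolates and in enough distinct serotypes."""
    kept = []
    for antigen, isolates in presence.items():
        serotypes = {serotype_of[s] for s in isolates}
        if len(isolates) >= min_isolates and len(serotypes) >= min_serotypes:
            kept.append(antigen)
    return sorted(kept)


def extends_coverage(first, second, presence):
    """True if some genome carries the second antigen but lacks the first."""
    return bool(presence[second] - presence[first])


# Hypothetical isolates and serotypes, for illustration only.
serotype_of = {"s1": "Ia", "s2": "III", "s3": "V", "s4": "V"}
# Hypothetical map: antigen -> isolates whose genome encodes it in full length.
presence = {
    "antigen_A": {"s1", "s2"},
    "antigen_B": {"s2", "s3", "s4"},
    "antigen_C": {"s3", "s4"},
}

candidates = widely_present(presence, serotype_of)
pair_ok = extends_coverage("antigen_A", "antigen_B", presence)
```

In this toy data set, antigen_C is excluded because both isolates carrying it belong to a single serotype, while the pair antigen_A and antigen_B satisfies the complementarity condition because isolates s3 and s4 encode antigen_B but not antigen_A.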

Within GBS AI-1, Applicants have found that Group B Streptococcus surface exposure of GBS 104 is dependent on the concurrent expression of GBS 80. It is thought that GBS 80 is involved in the transport or localization of GBS 104 to the surface of the bacteria. The two proteins may be oligomerized or otherwise chemically or physically associated. It is possible that this association involves a conformational change in GBS 104 that facilitates its transition to the surface of the GBS bacteria. In addition, one or more AI sortases may also be involved in this surface localization and chemical or physical association. Similar relationships are thought to exist within GBS AI-2. The compositions of the invention may therefore include at least two AI proteins, wherein the two AI proteins are physically or chemically associated. Preferably, the two AI proteins form an oligomer. Preferably, one or more of the AI proteins are in a hyper-oligomeric form. In one embodiment, the associated AI proteins may be purified or isolated from a GBS bacterium or recombinant host cell.

It is also an object of the invention to provide further and improved compositions for providing prophylactic or therapeutic protection against disease and/or infection of Gram positive bacteria. The compositions are based on the identification of adhesin islands within Streptococcal genomes and the use of amino acid sequences encoded by these islands in therapeutic or prophylactic compositions. The invention further includes compositions comprising immunogenic adhesin island proteins within other Gram positive bacteria in therapeutic or prophylactic compositions. Preferred Gram positive adhesin island proteins for use in the invention may be derived from Staphylococcus (such as S. aureus), Streptococcus (such as S. agalactiae (GBS), S. pyogenes (GAS), S. pneumoniae, S. mutans), Enterococcus (such as E. faecalis and E. faecium), Clostridium (such as C. difficile), Listeria (such as L. monocytogenes) and Corynebacterium (such as C. diphtheriae). Preferably, the Gram positive adhesin island surface proteins are in oligomeric or hyperoligomeric form.

For example, Applicants have identified adhesin islands within the genomes of several Group A Streptococcus serotypes and isolates. These adhesin islands are thought to encode surface proteins which are important in the bacteria's virulence, and Applicants have obtained the first electron micrographs revealing the presence of these adhesin island proteins in hyperoligomeric pilus structures on the surface of Group A Streptococcus.

Group A Streptococcus is a human specific pathogen which causes a wide variety of diseases ranging from pharyngitis and impetigo through life threatening invasive disease and necrotizing fasciitis. In addition, post-streptococcal autoimmune responses are still a major cause of cardiac pathology in children.

Group A Streptococcal infection of its human host can generally occur in three phases. The first phase involves attachment and/or invasion of the bacteria into host tissue and multiplication of the bacteria within the extracellular spaces. Generally this attachment phase begins in the throat or the skin. The deeper the tissue level infected, the more severe the damage that can be caused. In the second stage of infection, the bacteria secrete a soluble toxin that diffuses into the surrounding tissue or even systemically through the vasculature. This toxin binds to susceptible host cell receptors and triggers inappropriate immune responses by these host cells, resulting in pathology. Because the toxin can diffuse throughout the host, the necrosis directly caused by the GAS toxins may be physically located in sites distant from the bacterial infection. The final phase of GAS infection can occur long after the original bacteria have been cleared from the host system. At this stage, the host's previous immune response to the GAS bacteria can damage host tissues, such as the heart, due to cross reactivity between epitopes of a GAS surface protein, M, and those tissues. A general review of GAS infection can be found in Principles of Bacterial Pathogenesis, Groisman ed., Chapter 15 (2001).

In order to prevent the pathogenic effects associated with the later stages of GAS infection, an effective vaccine against GAS will preferably facilitate host elimination of the bacteria during the initial attachment and invasion stage.

Isolates of Group A Streptococcus are historically classified according to the M surface protein described above. The M protein is a surface exposed, trypsin-sensitive protein generally comprising two polypeptide chains complexed in an alpha helical formation. The carboxyl terminus is anchored in the cytoplasmic membrane and is highly conserved among all group A streptococci. The amino terminus, which extends through the cell wall to the cell surface, is responsible for the antigenic variability observed among the 80 or more serotypes of M proteins.

A second layer of classification is based on a variable, trypsin-resistant surface antigen, commonly referred to as the T-antigen. Decades of epidemiology based on M and T serological typing have been central to studies on the biological diversity and disease causing potential of Group A Streptococci. While the M-protein component and its inherent variability have been extensively characterized, even after five decades of study, there is still very little known about the structure and variability of T-antigens. Antisera to define T types are commercially available from several sources, including Sevapharma (http://www.sevapharma.cz/en).

The gene coding for one form of T-antigen, T-type 6, from an M6 strain of GAS (D741) has been cloned and characterized and maps to an approximately 11 kb highly variable pathogenicity island. Schneewind et al., J. Bacteriol. (1990) 172(6):3310-3317. This island is known as the Fibronectin-binding, Collagen-binding T-antigen (FCT) region because it contains, in addition to the T6 coding gene (tee6), members of a family of genes coding for Extra Cellular Matrix (ECM) binding proteins. Bessen et al., Infection & Immunity (2002) 70(3):1159-1167. Several of the protein products of this gene family have been shown to directly bind fibronectin and/or collagen. See Hanski et al., Infection & Immunity (1992) 60(12):5119-5125; Talay et al., Infection & Immunity (1992) 60(9):3837-3844; Jaffe et al. (1996) 21(2):373-384; Rocha et al., Adv Exp Med Biol. (1997) 418:737-739; Kreikemeyer et al., J Biol Chem (2004) 279(16):15850-15859; Podbielski et al., Mol. Microbiol. (1999) 31(4):1051-64; and Kreikemeyer et al., Int. J. Med Microbiol (2004) 294(2-3):177-88. In some cases direct evidence for a role of these proteins in adhesion and invasion has been obtained.

Applicants raised antiserum against a recombinant product of the tee6 gene and used it to explore the expression of T6 in M6 strain 2724. In immunoblots of mutanolysin extracts of this strain, the antiserum recognized, in addition to a band corresponding to the predicted molecular mass of the product, very high molecular weight ladders ranging in mobility from about 100 kDa to beyond the resolution of the 3-8% gradient gels used.

This pattern of high molecular weight products is similar to that observed in immunoblots of the protein components of the pili identified in Streptococcus agalactiae (described above) and previously in Corynebacterium diphtheriae. Electron microscopy of strain M6_2724 with antisera specific for the product of tee6 revealed abundant surface staining and long pilus like structures extending up to 700 nanometers from the bacterial surface, revealing that the T6 protein, one of the antigens recognized in the original Lancefield serotyping system, is located within a GAS Adhesin Island (GAS AI-1) and forms long covalently linked pilus structures.

Applicants have identified at least four different Group A Streptococcus Adhesin Islands. While these GAS AI sequences can be identified in numerous M types, Applicants have surprisingly discovered a correlation between the four main pilus subunits from the four different GAS AI types and specific T classifications. While other trypsin-resistant surface exposed proteins are likely also implicated in the T classification designations, the discovery of the role of the GAS adhesin islands (and the associated hyper-oligomeric pilus like structures) in T classification and GAS serotype variance has important implications for prevention and treatment of GAS infections. Applicants have identified protein components within each of the GAS adhesin islands which are associated with the pilus formation. These proteins are believed to be involved in the bacteria's initial adherence mechanisms. Immunological recognition of these proteins may allow the host immune response to slow or prevent the bacteria's transition into the more pathogenic later stages of infection.

In addition, Applicants have discovered that the GBS pili structures appear to be implicated in the formation of biofilms (populations of bacteria growing on a surface, often enclosed in an exopolysaccharide matrix). Biofilms are generally associated with bacterial resistance, as antibiotic treatments and host immune response are frequently unable to eradicate all of the bacterial components of the biofilm. Direction of a host immune response against surface proteins exposed during the first steps of bacterial attachment (i.e., before complete biofilm formation) is preferable.

The invention therefore provides for improved immunogenic compositions against GAS infection which may target GAS bacteria during their initial attachment efforts to the host epithelial cells and may provide protection against a wide range of GAS serotypes. The immunogenic compositions of the invention include GAS AI surface proteins which may be formulated in an oligomeric, or hyperoligomeric (pilus) form. The immunogenic compositions of the invention may include one or more GAS AI surface proteins. The invention also includes combinations of GAS AI surface proteins. Combinations of GAS AI surface proteins may be selected from the same adhesin island or they may be selected from different GAS adhesin islands.

Amino acid sequences encoded by such GAS Adhesin Islands may be used in immunogenic compositions for the treatment or prevention of GAS infection. Preferred immunogenic compositions of the invention comprise a GAS AI surface protein which has been formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer.

GAS Adhesin Islands generally include a series of open reading frames within a GAS genome that encode for a collection of surface proteins and sortases. A GAS Adhesin Island may encode for an amino acid sequence comprising at least one surface protein. Alternatively, a GAS Adhesin Island may encode for at least two surface proteins and at least one sortase. Preferably, a GAS Adhesin Island encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more GAS AI surface proteins may participate in the formation of a pilus structure on the surface of the Gram positive bacteria.

GAS Adhesin Islands of the invention preferably include a divergently transcribed transcriptional regulator. The transcriptional regulator may regulate the expression of the GAS AI operon. Examples of transcriptional regulators found in GAS AI sequences include RofA and Nra.

The GAS AI surface proteins may bind or otherwise adhere to fibrinogen, fibronectin, or collagen. One or more of the GAS AI surface proteins may comprise a fimbrial structural subunit.

One or more of the GAS AI surface proteins may include an LPXTG motif or other sortase substrate motif. The LPXTG motif may be followed by a hydrophobic region and a charged C terminus, which are thought to retard the protein in the cell membrane to facilitate recognition by the membrane-localized sortase. See Barnett, et al., J. Bacteriology (2004) 186 (17): 5865-5875.

GAS AI sequences may be generally categorized as Type 1, Type 2, Type 3, or Type 4, depending on the number and type of sortase sequences within the island and the percentage identity of other proteins (with the exception of RofA and cpa) within the island. Schematics of the GAS adhesin islands are set forth in FIG. 51A and FIG. 162.

“GAS Adhesin Island-1” or “GAS AI-1” comprises a series of approximately five open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-1 proteins”). GAS AI-1 preferably comprises surface proteins, a srtB sortase and a rofA divergently transcribed transcriptional regulator. GAS AI-1 surface proteins may include a fibronectin binding protein, a collagen adhesion protein and a fimbrial structural subunit. The fimbrial structural subunit (also known as tee6) is thought to form the shaft portion of the pilus like structure, while the collagen adhesion protein (Cpa) is thought to act as an accessory protein facilitating the formation of the pilus structure, exposed on the surface of the bacterial capsule.

Specifically, GAS AI-1 includes polynucleotide sequences encoding for two or more of M6_Spy0157, M6_Spy0158, M6_Spy0159, M6_Spy0160, and M6_Spy0161. The GAS AI-1 may also include polynucleotide sequences encoding for any one of CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial.

A preferred immunogenic composition of the invention comprises a GAS AI-1 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. The immunogenic composition of the invention may alternatively comprise an isolated GAS AI-1 surface protein in oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising GAS AI-1 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the GAS AI-1 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-1 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

One or more of the GAS AI-1 surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-1 may encode for at least one surface protein. Alternatively, GAS AI-1 may encode for at least two surface proteins and at least one sortase. Preferably, GAS AI-1 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

GAS AI-1 preferably includes a srtB sortase. GAS srtB sortases may preferably anchor surface proteins with an LPSTG motif (SEQ ID NO: 166), particularly where the motif is followed by a serine.

The GAS AI-1 protein of the composition may be selected from the group consisting of M6_Spy0157, M6_Spy0158, M6_Spy0159, M6_Spy0160, M6_Spy0161, CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial. GAS AI-1 surface proteins M6_Spy0157 (a fibronectin binding protein), M6_Spy0159 (a collagen adhesion protein, Cpa), M6_Spy0160 (a fimbrial structural subunit, tee6), CDC SS 410_fimbrial (a fimbrial structural subunit), ISS3650_fimbrial (a fimbrial structural subunit), and DSM2071_fimbrial (a fimbrial structural subunit) are preferred GAS AI-1 proteins for use in the immunogenic compositions of the invention. The fimbrial structural subunit tee6 and the collagen adhesion protein Cpa are preferred GAS AI-1 surface proteins. Preferably, each of these GAS AI-1 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122) or LPXSG (SEQ ID NO: 134) (conservative replacement of threonine with serine).

In addition to the open reading frames encoding the GAS AI-1 proteins, GAS AI-1 may also include a divergently transcribed transcriptional regulator such as rofA (i.e., the transcriptional regulator is located near or adjacent to the GAS AI protein open reading frames, but is transcribed in the opposite direction).

The GAS AI-1 surface proteins may be used alone, in combination with other GAS AI-1 surface proteins or in combination with other GAS AI surface proteins. Preferably, the immunogenic compositions of the invention include the GAS AI-1 fimbrial structural subunit (tee6) and the GAS AI-1 collagen binding protein. Still more preferably, the immunogenic compositions of the invention include the GAS AI-1 fimbrial structural subunit (tee6).

A second GAS adhesin island, “GAS Adhesin Island-2” or “GAS AI-2,” has also been identified in GAS serotypes. Amino acid sequences encoded by the open reading frames of GAS AI-2 may also be used in immunogenic compositions for the treatment or prevention of GAS infection.

A preferred immunogenic composition of the invention comprises a GAS AI-2 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. A preferred immunogenic composition of the invention alternatively comprises an isolated GAS AI-2 surface protein in oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising GAS AI-2 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

GAS AI-2 comprises a series of approximately eight open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-2 proteins”). GAS AI-2 preferably comprises surface proteins, a srtB sortase, a srtC1 sortase and a rofA divergently transcribed transcriptional regulator.

Specifically, GAS AI-2 includes polynucleotide sequences encoding for two or more of GAS15, Spy0127, GAS16, GAS17, GAS18, Spy0131, Spy0133, and GAS20.

One or more of the GAS AI-2 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-2 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

One or more of the GAS AI-2 surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-2 may encode for at least one surface protein. Alternatively, GAS AI-2 may encode for at least two surface proteins and at least one sortase. Preferably, GAS AI-2 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

GAS AI-2 preferably includes a srtB sortase and a srtC1 sortase. As discussed above, GAS srtB sortases may preferably anchor surface proteins with an LPSTG motif (SEQ ID NO: 166), particularly where the motif is followed by a serine. GAS srtC1 sortase may preferentially anchor surface proteins with a V(P/V)PTG (SEQ ID NO: 167) motif. GAS srtC1 may be differentially regulated by rofA.
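As an illustrative sketch only, the motif preferences stated above for srtB (LPSTG) and srtC1 (V(P/V)PTG) may be used to annotate candidate anchoring sites in a sequence. The function name sortase_candidates and the toy sequence are hypothetical, and the mapping from motif to sortase is merely the stated preference, not a demonstrated specificity.

```python
import re

# Motif preferences as stated in the text; V(P/V)PTG is read as VPPTG or VVPTG.
PATTERNS = {
    "srtB": re.compile(r"LPSTG"),
    "srtC1": re.compile(r"V[PV]PTG"),
}


def sortase_candidates(sequence):
    """Return sorted (0-based position, motif, putative sortase) tuples."""
    hits = []
    for sortase, pattern in PATTERNS.items():
        for m in pattern.finditer(sequence.upper()):
            hits.append((m.start(), m.group(), sortase))
    return sorted(hits)


# Hypothetical toy sequence for illustration; not a real GAS protein.
toy_hits = sortase_candidates("AALPSTGSKKVVPTGEE")
```

The check for a serine following the LPSTG motif, which the text notes may strengthen the srtB preference, is omitted here for brevity and could be added by inspecting the residue immediately after each srtB hit.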

The GAS AI-2 protein of the composition may be selected from the group consisting of GAS15, Spy0127, GAS16, GAS17, GAS18, Spy0131, Spy0133, and GAS20. GAS AI-2 surface proteins GAS15 (Cpa), GAS16 (thought to be a fimbrial protein, M1_128), GAS18 (M1_Spy0130), and GAS20 are preferred for use in the immunogenic compositions of the invention. GAS 16 is thought to form the shaft portion of the pilus like structure, while GAS 15 (the collagen adhesion protein Cpa) and GAS 18 are thought to act as accessory proteins facilitating the formation of the pilus structure, exposed on the surface of the bacterial capsule. Preferably, each of these GAS AI-2 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VVXTG (SEQ ID NO: 135), or EVXTG (SEQ ID NO: 136).

In addition to the open reading frames encoding the GAS AI-2 proteins, GAS AI-2 may also include a divergently transcribed transcriptional regulator such as rofA (i.e., the transcriptional regulator is located near or adjacent to the GAS AI protein open reading frames, but is transcribed in the opposite direction). The GAS AI-2 surface proteins may be used alone, in combination with other GAS AI-2 surface proteins or in combination with other GAS AI surface proteins. Preferably, the immunogenic compositions of the invention include the GAS AI-2 fimbrial protein (GAS16), the GAS AI-2 collagen binding protein (GAS15) and GAS18 (M1_Spy0130). More preferably, the immunogenic compositions of the invention include the GAS AI-2 fimbrial protein (GAS16).

A third GAS adhesin island, “GAS Adhesin Island-3” or “GAS AI-3,” has also been identified in numerous GAS serotypes. Amino acid sequences encoded by the open reading frames of GAS AI-3 may also be used in immunogenic compositions for the treatment or prevention of GAS infection.

A preferred immunogenic composition of the invention comprises a GAS AI-3 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. A preferred immunogenic composition of the invention alternatively comprises an isolated GAS AI-3 surface protein in oligomeric (pilus) form. The oligomeric or hyperoligomeric pilus structures comprising GAS AI-3 surface proteins may be purified or otherwise formulated for use in immunogenic compositions. GAS AI-3 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-3 proteins”). GAS AI-3 preferably comprises surface proteins, a srtC2 sortase, and a divergently transcribed negative transcriptional regulator (Nra). GAS AI-3 surface proteins may include a collagen binding protein, a fimbrial protein, and an F2-like fibronectin-binding protein. GAS AI-3 surface proteins may also include a hypothetical surface protein. The fimbrial protein is thought to form the shaft portion of the pilus-like structure, while the collagen adhesion protein (Cpa) and the hypothetical surface protein are thought to act as accessory proteins facilitating the formation of the pilus structure, exposed on the surface of the bacterial capsule. Preferred AI-3 surface proteins include the fimbrial protein, the collagen binding protein and the hypothetical protein. Preferably, each of these GAS AI-3 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VPXTG (SEQ ID NO: 137), QVXTG (SEQ ID NO: 138) or LPXAG (SEQ ID NO: 139).
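By way of illustration only, the sortase substrate motifs named above can be located in an amino acid sequence with a simple pattern search. The following is a minimal sketch, assuming that “X” denotes any amino acid residue; the function names and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# Motif variants named in the text; "X" is assumed to mean any residue.
MOTIFS = ["LPXTG", "VPXTG", "QVXTG", "LPXAG"]


def motif_to_regex(motif):
    """Translate a motif such as 'LPXTG' into a regex source string."""
    return "".join("[A-Z]" if ch == "X" else ch for ch in motif)


def find_sortase_motifs(sequence, motifs=MOTIFS):
    """Return sorted (position, motif, matched_text) tuples (0-based)."""
    hits = []
    for motif in motifs:
        # A lookahead is used so that overlapping occurrences are reported.
        pattern = "(?=(" + motif_to_regex(motif) + "))"
        for match in re.finditer(pattern, sequence):
            hits.append((match.start(), motif, match.group(1)))
    return sorted(hits)


# Hypothetical toy sequence, not a real GAS AI-3 protein.
toy_sequence = "MKKAAQVPTGVVLPSTGEEE"
for position, motif, text in find_sortase_motifs(toy_sequence):
    print(position, motif, text)
```

A search of this kind only flags candidate motifs; it does not by itself establish that a protein is a sortase substrate.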

Specifically, GAS AI-3 includes polynucleotide sequences encoding for two or more of SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, SpyM3_0104, Sps0100, Sps0101, Sps0102, Sps0103, Sps0104, Sps0105, Sps0106, orf78, orf79, orf80, orf81, orf82, orf83, orf84, spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. In one embodiment, GAS AI-3 may include open reading frames encoding for two or more of SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, and SpyM3_0104. Alternatively, GAS AI-3 may include open reading frames encoding for two or more of Sps0100, Sps0101, Sps0102, Sps0103, Sps0104, Sps0105, and Sps0106. Alternatively, GAS AI-3 may include open reading frames encoding for two or more of orf78, orf79, orf80, orf81, orf82, orf83, and orf84. Alternatively, GAS AI-3 may include open reading frames encoding for two or more of spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, and spyM18_0132. Alternatively, GAS AI-3 may include open reading frames encoding for two or more of SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, and SpyoM01000149. Alternatively, GAS AI-3 may also include polynucleotide sequences encoding for any one of ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial.

One or more of the GAS AI-3 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-3 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

One or more of the GAS AI-3 surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The GAS AI-3 sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-3 may encode for at least one surface protein. Alternatively, GAS AI-3 may encode for at least two surface proteins and at least one sortase. Preferably, GAS AI-3 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

GAS AI-3 preferably includes a srtC2 type sortase. GAS srtC2 type sortases may preferentially anchor surface proteins with a QVPTG (SEQ ID NO: 140) motif, particularly when the motif is followed by a hydrophobic region and a charged C-terminal tail. GAS srtC2 may be differentially regulated by Nra.

The GAS AI-3 protein of the composition may be selected from the group consisting of SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, SpyM3_0104, Sps0100, Sps0101, Sps0102, Sps0103, Sps0104, Sps0105, Sps0106, orf78, orf79, orf80, orf81, orf82, orf83, orf84, spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. GAS AI-3 surface proteins SpyM3_0098, SpyM3_0100, SpyM3_0102, SpyM3_0104, SPs0100, SPs0102, SPs0104, SPs0106, orf78, orf80, orf82, orf84, spyM18_0126, spyM18_0128, spyM18_0130, spyM18_0132, SpyoM01000155, SpyoM01000153, SpyoM01000151, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial are preferred GAS AI-3 proteins for use in the immunogenic compositions of the invention.

In addition to the open reading frames encoding the GAS AI-3 proteins, GAS AI-3 may also include a transcriptional regulator such as Nra.

GAS AI-3 may also include a LepA putative signal peptidase I protein.

The GAS AI-3 surface proteins may be used alone, in combination with other GAS AI-3 surface proteins or in combination with other GAS AI surface proteins. Preferably, the immunogenic compositions of the invention include the GAS AI-3 fimbrial protein, the GAS AI-3 collagen binding protein, the GAS AI-3 surface protein (such as SpyM3_0102, M3_Sps0104, M5_orf82, or spyM18_0130), and the fibronectin binding protein PrtF2. More preferably, the immunogenic compositions of the invention include the GAS AI-3 fimbrial protein, the GAS AI-3 collagen binding protein, and the GAS AI-3 surface protein. Still more preferably, the immunogenic compositions of the invention include the GAS AI-3 fimbrial protein.

Representative examples of the GAS AI-3 fimbrial protein include SpyM3_0100, M3_Sps0102, M5_orf80, spyM18_0128, SpyoM01000153, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial.

Representative examples of the GAS AI-3 collagen binding protein include SpyM3_0098, M3_Sps0100, M5_orf78, spyM18_0126, and SpyoM01000155.

Representative examples of the GAS AI-3 fibronectin binding protein PrtF2 include SpyM3_0104, M3_Sps0106, M5_orf84, spyM18_0132, and SpyoM01000149.

A fourth GAS adhesin island, “GAS Adhesin Island-4” or “GAS AI-4,” has also been identified in GAS serotypes. Amino acid sequences encoded by the open reading frames of GAS AI-4 may also be used in immunogenic compositions for the treatment or prevention of GAS infection.

A preferred immunogenic composition of the invention comprises a GAS AI-4 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. A preferred immunogenic composition of the invention alternatively comprises an isolated GAS AI-4 surface protein in oligomeric (pilus) form. The oligomeric or hyperoligomeric pilus structures comprising GAS AI-4 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

GAS AI-4 comprises a series of approximately eight open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-4 proteins”). This GAS adhesin island 4 (“GAS AI-4”) comprises surface proteins, a srtC2 sortase, and a RofA regulatory protein. GAS AI-4 surface proteins may include a fimbrial protein, F1-like and F2-like fibronectin-binding proteins, and a collagen adhesion protein (Cpa). GAS AI-4 surface proteins may also include a hypothetical surface protein in an open reading frame (orf).

The fimbrial protein (EftLSL) is thought to form the shaft portion of the pilus-like structure, while the collagen adhesion protein (Cpa) and the hypothetical protein are thought to act as accessory proteins facilitating the formation of the pilus structure, exposed on the surface of the bacterial capsule. Preferably, each of these GAS AI-4 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VPXTG (SEQ ID NO: 137), QVXTG (SEQ ID NO: 138) or LPXAG (SEQ ID NO: 139).

Specifically, GAS AI-4 includes polynucleotide sequences encoding for two or more of 19224134, 19224135, 19224136, 19224137, 19224138, 19224139, 19224140, and 19224141. A GAS AI-4 polynucleotide may also include polynucleotide sequences encoding for any one of 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial. One or more of the GAS AI-4 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-4 open reading frames may be replaced by a sequence having sequence homology (sequence identity) to the replaced ORF.

One or more of the GAS AI-4 surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The GAS AI-4 sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-4 may encode for at least one surface protein. Alternatively, GAS AI-4 may encode for at least two surface proteins and at least one sortase. Preferably, GAS AI-4 encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

GAS AI-4 includes a SrtC2 type sortase. GAS SrtC2 type sortases may preferentially anchor surface proteins with a QVPTG (SEQ ID NO: 140) motif, particularly when the motif is followed by a hydrophobic region and a charged C-terminal tail.

The GAS AI-4 protein of the composition may be selected from the group consisting of 19224134, 19224135, 19224136, 19224137, 19224138, 19224139, 19224140, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial. GAS AI-4 surface proteins 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial are preferred proteins for use in the immunogenic compositions of the invention.

In addition to the open reading frames encoding the GAS AI-4 proteins, GAS AI-4 may also include a divergently transcribed transcriptional regulator such as RofA (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction).

GAS AI-4 may also include a LepA putative signal peptidase I protein and a MsmRL protein. The GAS AI-4 surface proteins may be used alone, in combination with other GAS AI-4 surface proteins or in combination with other GAS AI surface proteins. Preferably, the immunogenic compositions of the invention include the GAS AI-4 fimbrial protein (EftLSL or 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, or ISS4538_fimbrial), the GAS AI-4 collagen binding protein, the GAS AI-4 surface protein (such as M12 isolate A735 orf 2), and the fibronectin binding proteins PrtF1 and PrtF2. More preferably, the immunogenic compositions of the invention include the GAS AI-4 fimbrial protein, the GAS AI-4 collagen binding protein, and the GAS AI-4 surface protein. Still more preferably, the immunogenic compositions of the invention include the GAS AI-4 fimbrial protein.

The GAS AI proteins of the invention may be used in immunogenic compositions for prophylactic or therapeutic immunization against GAS infection. For example, the invention may include an immunogenic composition comprising one or more GAS AI-1 proteins and one or more of any of GAS AI-2, GAS AI-3, or GAS AI-4 proteins. For example, the invention includes an immunogenic composition comprising at least two GAS AI proteins where each protein is selected from a different GAS adhesin island. The two GAS AI proteins may be selected from one of the following GAS AI combinations: GAS AI-1 and GAS AI-2; GAS AI-1 and GAS AI-3; GAS AI-1 and GAS AI-4; GAS AI-2 and GAS AI-3; GAS AI-2 and GAS AI-4; and GAS AI-3 and GAS AI-4. Preferably the combination includes fimbrial proteins from one or more GAS adhesin islands.
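The six two-island combinations listed above are simply every unordered pair of the four GAS adhesin islands. As an illustrative sketch only (the variable names are hypothetical), the pairs can be enumerated as follows:

```python
from itertools import combinations

# The four GAS adhesin islands named in the text.
islands = ["GAS AI-1", "GAS AI-2", "GAS AI-3", "GAS AI-4"]

# Every unordered pair of distinct islands: 4 choose 2 = 6 pairs.
pairs = list(combinations(islands, 2))

for first, second in pairs:
    print(f"{first} and {second}")
```

The enumeration reproduces the listed combinations in the same order, from GAS AI-1 and GAS AI-2 through GAS AI-3 and GAS AI-4.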

The immunogenic compositions may also be selected to provide protection against an increased range of GAS serotypes and strain isolates. For example, the immunogenic composition may comprise a first and second GAS AI protein, wherein a full length polynucleotide sequence encoding for the first GAS AI protein is not present in a genome comprising a full length polynucleotide sequence encoding for the second GAS AI protein. In addition, each antigen selected for use in the immunogenic compositions will preferably be present in the genomes of multiple GAS serotypes and strain isolates. Preferably, each antigen is present in the genomes of at least two (e.g., 3, 4, 5, 6, 7, 8, 9, 10, or more) GAS strain isolates. More preferably, each antigen is present in the genomes of at least two (e.g., at least 3, 4, 5, or more) GAS serotypes.

Applicants have also identified adhesin islands within the genome of Streptococcus pneumoniae. These adhesin islands are thought to encode surface proteins which are important in the bacteria's virulence. Amino acid sequences encoded by such S. pneumoniae Adhesin Islands may be used in immunogenic compositions for the treatment or prevention of S. pneumoniae infection. Preferred immunogenic compositions of the invention comprise a S. pneumoniae AI surface protein which has been formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. A preferred immunogenic composition of the invention alternatively comprises an isolated S. pneumoniae surface protein in oligomeric (pilus) form. The oligomeric or hyperoligomeric pilus structures comprising S. pneumoniae surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

The S. pneumoniae Adhesin Islands generally include a series of open reading frames within a S. pneumoniae genome that encode for a collection of surface proteins and sortases. A S. pneumoniae Adhesin Island may encode for an amino acid sequence comprising at least one surface protein. Alternatively, the S. pneumoniae Adhesin Island may encode for at least two surface proteins and at least one sortase. Preferably, a S. pneumoniae Adhesin Island encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more S. pneumoniae AI surface proteins may participate in the formation of a pilus structure on the surface of the S. pneumoniae bacteria.

The S. pneumoniae Adhesin Islands of the invention preferably include a divergently transcribed transcriptional regulator. The transcriptional regulator may regulate the expression of the S. pneumoniae AI operon. An example of a transcriptional regulator found in S. pneumoniae AI sequences is rlrA.

A schematic of the organization of a S. pneumoniae AI locus is provided in FIG. 137. The locus comprises open reading frames encoding a transcriptional regulator (rlrA), cell wall surface proteins (rrgA, rrgB, rrgC) and sortases (srtB, srtC, srtD).

S. pneumoniae AI sequences may be generally divided into two groups of homology, S. pneumoniae AI-a and AI-b. S. pneumoniae strains that comprise AI-a include 14 CSR 10, 19A Hungary 6, 23F Poland 16, 670, 6B Finland 12, and 6B Spain 2. S. pneumoniae strains that comprise AI-b include 19F Taiwan 14, 9V Spain 3, 23F Taiwan 15 and TIGR4.

S. pneumoniae AI from TIGR4 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from TIGR4 includes polynucleotide sequences encoding for two or more of SP0462, SP0463, SP0464, SP0465, SP0466, SP0467, and SP0468.

One or more of the S. pneumoniae AI from TIGR4 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from TIGR4 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae strain 670 AI comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae strain 670 AI includes polynucleotide sequences encoding for two or more of orf1_670, orf3_670, orf4_670, orf5_670, orf6_670, orf7_670, and orf8_670.

One or more of the S. pneumoniae strain 670 AI polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 670 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 14 CSR10 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 14 CSR10 includes polynucleotide sequences encoding for two or more of ORF2_14CSR, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF6_14CSR, ORF7_14CSR, and ORF8_14CSR.

One or more of the S. pneumoniae AI from 14 CSR10 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 14 CSR10 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 19A Hungary 6 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 19A Hungary 6 includes polynucleotide sequences encoding for two or more of ORF2_19AH, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF6_19AH, ORF7_19AH, and ORF8_19AH.

One or more of the S. pneumoniae AI from 19A Hungary 6 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 19A Hungary 6 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 19F Taiwan 14 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 19F Taiwan 14 includes polynucleotide sequences encoding for two or more of ORF2_19FTW, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF6_19FTW, ORF7_19FTW, and ORF8_19FTW.

One or more of the S. pneumoniae AI from 19F Taiwan 14 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 19F Taiwan 14 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 23F Poland 16 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 23F Poland 16 includes polynucleotide sequences encoding for two or more of ORF2_23FP, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF6_23FP, ORF7_23FP, and ORF8_23FP.

One or more of the S. pneumoniae AI from 23F Poland 16 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 23F Poland 16 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 23F Taiwan 15 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 23F Taiwan 15 includes polynucleotide sequences encoding for two or more of ORF2_23FTW, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF6_23FTW, ORF7_23FTW, and ORF8_23FTW.

One or more of the S. pneumoniae AI from 23F Taiwan 15 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 23F Taiwan 15 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 6B Finland 12 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 6B Finland 12 includes polynucleotide sequences encoding for two or more of ORF2_6BF, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF6_6BF, ORF7_6BF, and ORF8_6BF.

One or more of the S. pneumoniae AI from 6B Finland 12 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 6B Finland 12 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 6B Spain 2 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 6B Spain 2 includes polynucleotide sequences encoding for two or more of ORF2_6BSP, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF6_6BSP, ORF7_6BSP, and ORF8_6BSP.

One or more of the S. pneumoniae AI from 6B Spain 2 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 6B Spain 2 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

S. pneumoniae AI from 9V Spain 3 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“S. pneumoniae AI proteins”). Specifically, S. pneumoniae AI from 9V Spain 3 includes polynucleotide sequences encoding for two or more of ORF2_9VSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, ORF6_9VSP, ORF7_9VSP, and ORF8_9VSP.

One or more of the S. pneumoniae AI from 9V Spain 3 polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae AI from 9V Spain 3 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae AI surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The S. pneumoniae AI sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae AI may encode for at least one surface protein. Alternatively, S. pneumoniae AI may encode for at least two surface proteins and at least one sortase. Preferably, S. pneumoniae AI encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif.

The S. pneumoniae AI protein of the composition may be selected from the group consisting of SP0462, SP0463, SP0464, SP0465, SP0466, SP0467, SP0468, orf1_670, orf3_670, orf4_670, orf5_670, orf6_670, orf7_670, orf8_670, ORF2_14CSR, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF6_14CSR, ORF7_14CSR, ORF8_14CSR, ORF2_19AH, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF6_19AH, ORF7_19AH, ORF8_19AH, ORF2_19FTW, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF6_19FTW, ORF7_19FTW, ORF8_19FTW, ORF2_23FP, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF6_23FP, ORF7_23FP, ORF8_23FP, ORF2_23FTW, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF6_23FTW, ORF7_23FTW, ORF8_23FTW, ORF2_6BF, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF6_6BF, ORF7_6BF, ORF8_6BF, ORF2_6BSP, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF6_6BSP, ORF7_6BSP, ORF8_6BSP, ORF2_9VSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, ORF6_9VSP, ORF7_9VSP, and ORF8_9VSP.

S. pneumoniae AI surface proteins are preferred proteins for use in the immunogenic compositions of the invention. In one embodiment, the compositions of the invention comprise combinations of two or more S. pneumoniae AI surface proteins. Preferably such combinations are selected from two or more of the group consisting of SP0462, SP0463, SP0464, orf3_670, orf4_670, orf5_670, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF3_9VSP, ORF4_9VSP, and ORF5_9VSP.

In addition to the open reading frames encoding the S. pneumoniae AI proteins, S. pneumoniae AI may also include a transcriptional regulator.

The S. pneumoniae AI proteins of the invention may be used in immunogenic compositions for prophylactic or therapeutic immunization against S. pneumoniae infection. For example, the invention may include an immunogenic composition comprising one or more AI proteins from S. pneumoniae TIGR4 and one or more proteins from S. pneumoniae strain 670. The immunogenic composition may comprise one or more AI proteins from any one or more of S. pneumoniae strains TIGR4, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 14 CSR 10, 19F Taiwan 14, 23F Taiwan 15, 23F Poland 16, and 670.

The immunogenic compositions may also be selected to provide protection against an increased range of S. pneumoniae serotypes and strain isolates. For example, the immunogenic composition may comprise a first and second S. pneumoniae AI protein, wherein a full length polynucleotide sequence encoding for the first S. pneumoniae AI protein is not present in a genome comprising a full length polynucleotide sequence encoding for the second S. pneumoniae AI protein. In addition, each antigen selected for use in the immunogenic compositions will preferably be present in the genomes of multiple S. pneumoniae serotypes and strain isolates. Preferably, each antigen is present in the genomes of at least two (e.g., 3, 4, 5, 6, 7, 8, 9, 10, or more) S. pneumoniae strain isolates. More preferably, each antigen is present in the genomes of at least two (e.g., at least 3, 4, 5, or more) S. pneumoniae serotypes.

The immunogenic compositions may also be selected to provide protection against an increased range of serotypes and strain isolates of a Gram positive bacteria. For example, the immunogenic composition may comprise a first and second Gram positive bacteria AI protein, wherein a full length polynucleotide sequence encoding for the first Gram positive bacteria AI protein is not present in a genome comprising a full length polynucleotide sequence encoding for the second Gram positive bacteria AI protein. In addition, each antigen selected for use in the immunogenic compositions will preferably be present in the genomes of multiple serotypes and strain isolates of the Gram positive bacteria. Preferably, each antigen is present in the genomes of at least two (e.g., 3, 4, 5, 6, 7, 8, 9, 10, or more) Gram positive bacteria strain isolates. More preferably, each antigen is present in the genomes of at least two (e.g., at least 3, 4, 5, or more) Gram positive bacteria serotypes. One or both of the first and second AI proteins may preferably be in oligomeric or hyperoligomeric form.
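The selection criteria described above can be sketched as a check over a presence/absence table. The following is an illustrative sketch only, under two stated assumptions: the complementarity condition is read as “at least one genome carrying the second antigen lacks the first,” and the antigen and isolate names are hypothetical placeholders rather than data from the specification.

```python
def meets_selection_criteria(first, second, presence, min_isolates=2):
    """Check a pair of antigens against the criteria described in the text.

    presence maps an antigen name to the set of strain isolates whose
    genome carries a full length sequence encoding that antigen.
    """
    first_isolates = presence[first]
    second_isolates = presence[second]
    # Assumed reading: some genome that carries the second antigen
    # does not carry the first.
    complementary = bool(second_isolates - first_isolates)
    # Each antigen should be present in at least `min_isolates` isolates.
    broad_enough = (len(first_isolates) >= min_isolates
                    and len(second_isolates) >= min_isolates)
    return complementary and broad_enough


def covered_isolates(antigens, presence):
    """Return the isolates carrying at least one of the given antigens."""
    covered = set()
    for antigen in antigens:
        covered |= presence[antigen]
    return covered


# Hypothetical presence/absence table (placeholder names only).
presence = {
    "antigen_A": {"iso1", "iso2", "iso3"},
    "antigen_B": {"iso3", "iso4"},
    "antigen_C": {"iso1"},
}

print(meets_selection_criteria("antigen_A", "antigen_B", presence))
print(sorted(covered_isolates(["antigen_A", "antigen_B"], presence)))
```

Under these assumptions, a pair such as antigen_A and antigen_B qualifies because antigen_B extends coverage to an isolate that lacks antigen_A, whereas antigen_C fails the breadth requirement.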

Adhesin island surface proteins from two or more Gram positive bacterial genera or species may be combined to provide an immunogenic composition for prophylactic or therapeutic treatment of disease or infection of two or more Gram positive bacterial genera or species. Optionally, the adhesin island surface proteins may be associated together in an oligomeric or hyperoligomeric structure.

In one embodiment, the invention comprises adhesin island surface proteins from two or more Streptococcus species. For example, the invention includes a composition comprising a GBS AI surface protein and a GAS adhesin island surface protein. As another example, the invention includes a composition comprising a GAS adhesin island surface protein and a S. pneumoniae adhesin island surface protein. One or both of the GAS AI surface protein and the S. pneumoniae AI surface protein may be in oligomeric or hyperoligomeric form. As a further example, the invention includes a composition comprising a GBS adhesin island surface protein and a S. pneumoniae adhesin island surface protein.

In one embodiment, the invention comprises adhesin island surface proteins from two or more Gram positive bacterial genera. For example, the invention includes a composition comprising a Streptococcus adhesin island protein and a Corynebacterium adhesin island protein. One or more of the Gram positive bacteria AI surface proteins may be in an oligomeric or hyperoligomeric form.

In addition, the AI polynucleotides and amino acid sequences of the invention may also be used in diagnostics to identify the presence or absence of GBS (or a Gram positive bacteria) in a biological sample. They may be used to generate antibodies which can be used to identify the presence or absence of an AI protein in a biological sample or in a prophylactic or therapeutic treatment for GBS (or a Gram positive bacterial) infection. Further, the AI polynucleotides and amino acid sequences of the invention may also be used to identify small molecule compounds which inhibit or decrease the virulence associated activity of the AI.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 presents a schematic depiction of Adhesin Island 1 (“AI-1”) comprising open reading frames for GBS 80, GBS 52, SAG0647, SAG0648 and GBS 104.

FIG. 2 illustrates the identification of AI-1 sequences in several GBSserotypes and strain isolates (GBS serotype V, strain isolate 2603; GBSserotype III, strain isolate nem316; GBS serotype II, strain isolate18RS21; GBS serotype V, strain isolate CJB111; GBS serotype III, strainisolate COH1 and GBS serotype 1a, strain isolate A909). (An AI-1 was notidentified in GBS serotype 1b, strain isolate H36B or GBS serotype 1a,strain isolate 515).

FIG. 3 presents a schematic depiction of the correlation between AI-1and the Adhesin Island 2 (“AI-2”) within the GBS serotype V, strainisolate 2603 genome. (This AI-2 comprises open reading frames for GBS67, GBS 59, SAG1406, SAG1405 and GBS 150).

FIG. 4 illustrates the identification of AI-2 comprising open readingframes encoding for GBS 67, GBS 59, SAG1406, SAG1404 and GBS 150 (orsequences having sequence homology thereto) in several GBS serotypes andstrain isolates (GBS serotype V, strain isolate 2603; GBS serotype III,strain isolate NEM316; GBS serotype 1b, strain isolate H36B; GBSserotype V, strain isolate CJB111; GBS serotype II, strain isolate18RS21; and GBS serotype 1a, strain isolate 515).

FIG. 4 also illustrates the identification of AI-2 comprising openreading frames encoding for 01520 (a sortase), 01521, 01522 (a sortase),01523 (spb1), 01524 and 01525 (or sequences having sequence homologythereto).

FIG. 5 presents data showing that GBS 80 binds to fibronectin and fibrinogen in ELISA.

FIG. 6 illustrates that all genes in AI-1 are co-transcribed as an operon.

FIG. 7 presents schematic depictions of in-frame deletion mutations within AI-1.

FIG. 8 presents FACS data showing that GBS 80 is required for surface localization of GBS 104.

FIG. 9 presents FACS data showing that sortases SAG0647 and SAG0648 play a semi-redundant role in surface exposure of GBS 80 and GBS 104.

FIG. 10 presents Western Blots of the in-frame deletion mutants probed with anti-GBS 80 and anti-GBS 104 antisera.

FIG. 11: Electron micrograph of surface exposed pili structures in Streptococcus agalactiae containing GBS 80.

FIG. 12: PHD predicted secondary structure of GBS 067.

FIGS. 13, 14 and 15: Electron micrograph of surface exposed pili structures of strain isolate COH1 of Streptococcus agalactiae containing a plasmid insert encoding GBS 80.

FIGS. 16 and 17: Electron micrograph of surface exposed pili structure of wild type strain isolate COH1 of Streptococcus agalactiae.

FIG. 18: Alignment of polynucleotide sequences of AI-1 from serotype V, strain isolates 2603 and CJB111; serotype II, strain isolate 18RS21; serotype III, strain isolates COH1 and NEM316; and serotype 1a, strain isolate A909.

FIG. 19: Alignment of polynucleotide sequences of AI-2 from serotype V, strain isolates 2603 and CJB111; serotype II, strain isolate 18RS21; serotype 1b, strain isolate H36B; and serotype 1a, strain isolate 515.

FIG. 20: Alignment of polynucleotide sequences of AI-2 from serotype V, strain isolate 2603 and serotype III, strain isolate NEM316.

FIG. 21: Alignment of polynucleotide sequences of AI-2 from serotype III, strain isolate COH1 and serotype Ia, strain isolate A909.

FIG. 22: Alignment of amino acid sequences of AI-1 surface protein GBS 80 from serotype V, strain isolates 2603 and CJB111; serotype 1a, strain isolate A909; serotype III, strain isolates COH1 and NEM316.

FIG. 23: Alignment of amino acid sequences of AI-1 surface protein GBS 104 from serotype V, strain isolates 2603 and CJB111; serotype III, strain isolates COH1 and NEM316; and serotype II, strain isolate 18RS21.

FIG. 24: Alignment of amino acid sequences of AI-2 surface protein GBS 067 from serotype V, strain isolates 2603 and CJB111; serotype 1a, strain isolate 515; serotype II, strain isolate 18RS21; serotype Ib, strain isolate H36B; and serotype III, strain isolate NEM316.

FIG. 25: Illustrates that GBS closely associates with tight junctions and crosses the monolayer of ME180 cervical epithelial cells by a paracellular route.

FIG. 26: Illustrates GBS infection of ME180 cells.

FIG. 27: Illustrates that GBS 80 recombinant protein does not bind to epithelial cells.

FIG. 28: Illustrates that deletion of GBS 80 does not affect the capacity of GBS strain 2603 V/R to adhere to and invade ME180 cervical epithelial cells.

FIG. 29: Illustrates binding of recombinant GBS 104 protein to epithelial cells.

FIG. 30: Illustrates that deletion of GBS 104 in the GBS strain COH1 reduces the capacity of GBS to adhere to ME180 cervical epithelial cells.

FIG. 31: Illustrates that GBS 80 knockout mutant strain partially loses the ability to translocate through an epithelial cell monolayer.

FIG. 32: Illustrates that deletion of GBS 104, but not GBS 80, reduces the capacity of GBS to invade the J774 macrophage-like cell line.

FIG. 33: Illustrates that GBS 104 knockout mutant strain translocates through an epithelial monolayer less efficiently than the isogenic wild type.

FIG. 34: Negative stained electron micrographs of GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80.

FIG. 35: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 80 antibodies (visualized with 10 nm gold particles).

FIG. 36: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 80 antibodies (visualized with 10 nm gold particles).

FIG. 37: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 80 antibodies (visualized with 20 nm gold particles).

FIG. 38: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 104 antibodies or preimmune sera (visualized with 10 nm gold particles).

FIG. 39: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 80 antibodies (visualized with 20 nm gold particles) and anti-GBS 104 antibodies (visualized with 10 nm gold particles).

FIG. 40: Electron micrographs of surface exposed pili structures on GBS serotype III, strain isolate COH1, containing a plasmid insert to over-express GBS 80, stained with anti-GBS 80 antibodies (visualized with 20 nm gold particles) and anti-GBS 104 antibodies (visualized with 10 nm gold particles).

FIG. 41: Illustrates that GBS 80 is necessary for polymer formation and GBS 104 and sortase SAG0648 are necessary for efficient assembly of pili.

FIG. 42: Illustrates that GBS 67 is part of a second pilus and that GBS 80 is polymerized in strain 515.

FIG. 43: Illustrates that two macro-molecules are visible in COH1, one of which is the GBS 80 pilin.

FIG. 44: Illustrates pilin assembly.

FIG. 45: Illustrates that GBS 52 is a minor component of the GBS pilus.

FIG. 46: Illustrates that the pilus is found in the supernatant of a bacterial culture.

FIG. 47: Illustrates that the pilus is found in the supernatant of bacterial cultures in all phases.

FIG. 48: Illustrates that in COH1, only the GBS 80 protein and one sortase (SAG0647 or SAG0648) are required for polymerization.

FIG. 49: IEM image of GBS 80 staining of a GBS serotype VIII strain JM9030013 that expresses pili.

FIG. 50: IEM image of GBS 104 staining of a GBS serotype VIII strain JM9030013 that expresses pili.

FIG. 51A: Schematic depiction of open reading frames comprising a GAS AI-2 serotype M1 isolate, GAS AI-3 serotype M3, M5, M18, and M49 isolates, a GAS AI-4 serotype M12 isolate, and a GAS AI-1 serotype M6 isolate.

FIG. 51B: Amino acid alignment of SrtC1-type sortase of a GAS AI-2 serotype M1 isolate, SrtC2-type sortases of serotype M3, M5, M18, and M49 isolates, and a SrtC2-type sortase of a GAS AI-4 serotype M12 isolate.

FIG. 52: Amino acid alignment of the capsular polysaccharide adhesion proteins of GAS AI-4 serotype M12 (A735), GAS AI-3 serotype M5 (Manfredo), S. pyogenes strain MGAS315 serotype M3, S. pyogenes strain SSI-1 serotype M3, S. pyogenes strain MGAS8232 serotype M3, and GAS AI-2 serotype M1.

FIG. 53: Amino acid alignment of F-like fibronectin-binding proteins of GAS AI-4 serotype M12 (A735) and S. pyogenes strain MGAS10394 serotype M6.

FIG. 54: Amino acid alignment of F2-like fibronectin-binding proteins of GAS AI-4 serotype M12 (A735), S. pyogenes strain MGAS8232 serotype M3, GAS AI-3 strain M5 (Manfredo), S. pyogenes strain SSI serotype M3, and S. pyogenes strain MGAS315 serotype M3.

FIG. 55: Amino acid alignment of fimbrial proteins of GAS AI-4 serotype M12 (A735), GAS AI-3 serotype M5 (Manfredo), S. pyogenes strain MGAS315 serotype M3, S. pyogenes strain SSI serotype M3, S. pyogenes strain MGAS8232 serotype M3, and S. pyogenes M1 GAS serotype M1.

FIG. 56: Amino acid alignment of hypothetical proteins of GAS AI-4 serotype M12 (A735), S. pyogenes strain MGAS315 serotype M3, S. pyogenes strain SSI-1 serotype M3, GAS AI-3 serotype M5 (Manfredo), and S. pyogenes strain MGAS8232 serotype M3.

FIG. 57: Results of FASTA homology search for amino acid sequences that align with the collagen adhesion protein of GAS AI-1 serotype M6 (MGAS10394).

FIG. 58: Results of FASTA homology search for amino acid sequences that align with the fimbrial structural subunit of GAS AI-1 serotype M6 (MGAS10394).

FIG. 59: Results of FASTA homology search for amino acid sequences that align with the hypothetical protein of GAS AI-2 serotype M1 (SF370).

FIG. 60: Specifies pilin and E box motifs present in GAS type 3 and 4 adhesin islands.

FIG. 61: Illustrates that surface expression of GBS 80 protein on GBS strains COH and JM9130013 correlates with formation of pili structures. Surface expression of GBS 80 was determined by FACS analysis using an antibody that cross-hybridizes with GBS 80. Formation of pili structures was determined by immunogold electron microscopy using gold-labelled anti-GBS 80 antibody.

FIG. 62: Illustrates that surface exposure is capsule-dependent for GBS 322 but not for GBS 80.

FIG. 63: Illustrates the amino acid sequence identity of GBS 59 proteins in GBS strains.

FIG. 64: Western blotting of whole GBS cell extracts with anti-GBS 59 antibodies.

FIG. 65: Western blotting of purified GBS 59 and whole GBS cell extracts with anti-GBS 59 antibodies.

FIG. 66: FACS analysis of GBS strains CJB111, 7357B, 515 using GBS 59 antiserum.

FIG. 67: Illustrates that anti-GBS 59 antibodies are opsonic for CJB111 GBS strain serotype V.

FIG. 68: Western blotting of GBS strain JM9130013 total extracts.

FIG. 69: Western blotting of GBS strain 515 total extracts shows that GBS 67 and GBS 150 are parts of a pilus.

FIG. 70: Western blotting of GBS strain 515 knocked out for GBS 67 expression.

FIG. 71: FACS analysis of GBS strain 515 and GBS strain 515 knocked out for GBS 67 expression using GBS 67 and GBS 59 antiserum.

FIG. 72: Illustrates complementation of GBS 515 knocked out for GBS 67 expression with a construct overexpressing GBS 80.

FIG. 73: FACS analysis of GAS serotype M6 for spyM6_0159 surface expression.

FIG. 74: FACS analysis of GAS serotype M6 for spyM6_0160 surface expression.

FIG. 75: FACS analysis of GAS serotype M1 for GAS 15 surface expression.

FIG. 76: FACS analysis of GAS serotype M1 for GAS 16 surface expression using a first anti-GAS 16 antiserum.

FIG. 77: FACS analysis of GAS serotype M1 for GAS 18 surface expression using a first anti-GAS 18 antiserum.

FIG. 78: FACS analysis of GAS serotype M1 for GAS 18 surface expression using a second anti-GAS 18 antiserum.

FIG. 79: FACS analysis of GAS serotype M1 for GAS 16 surface expression using a second anti-GAS 16 antiserum.

FIG. 80: FACS analysis of GAS serotype M3 for spyM3_0098 surface expression.

FIG. 81: FACS analysis of GAS serotype M3 for spyM3_0100 surface expression.

FIG. 82: FACS analysis of GAS serotype M3 for spyM3_0102 surface expression.

FIG. 83: FACS analysis of GAS serotype M3 for spyM3_0104 surface expression.

FIG. 84: FACS analysis of GAS serotype M3 for spyM3_0106 surface expression.

FIG. 85: FACS analysis of GAS serotype M12 for 19224134 surface expression.

FIG. 86: FACS analysis of GAS serotype M12 for 19224135 surface expression.

FIG. 87: FACS analysis of GAS serotype M12 for 19224137 surface expression.

FIG. 88: FACS analysis of GAS serotype M12 for 19224141 surface expression.

FIG. 89: Western blot analysis of GAS 15 expression on GAS M1 bacteria.

FIG. 90: Western blot analysis of GAS 15 expression using GAS 15 immune sera.

FIG. 91: Western blot analysis of GAS 15 expression using GAS 15 pre-immune sera.

FIG. 92: Western blot analysis of GAS 16 expression on GAS M1 bacteria.

FIG. 93: Western blot analysis of GAS 16 expression using GAS 16 immune sera.

FIG. 94: Western blot analysis of GAS 16 expression using GAS 16 pre-immune sera.

FIG. 95: Western blot analysis of GAS 18 on GAS M1 bacteria.

FIG. 96: Western blot analysis of GAS 18 using GAS 18 immune sera.

FIG. 97: Western blot analysis of GAS 18 using GAS 18 pre-immune sera.

FIG. 98: Western blot analysis of M6_Spy0159 expression on GAS bacteria.

FIG. 99: Western blot analysis of 19224135 expression on M12 GAS bacteria.

FIG. 100: Western blot analysis of 19224137 expression on M12 GAS bacteria.

FIG. 101: Full length nucleotide sequence of an S. pneumoniae strain 670μl.

FIG. 102: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain 2580.

FIG. 103: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain 2913.

FIG. 104: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain 3280.

FIG. 105: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain 3348.

FIG. 106: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain 2719.

FIG. 107: Western blot analysis of GAS 15, GAS 16, and GAS 18 in GAS M1 strain SF370.

FIG. 108: Western blot analysis of 19224135 and 19224137 in GAS M12 strain 2728.

FIG. 109: Western blot analysis of 19224139 in GAS M12 strain 2728 using antisera raised against SpyM3_0102.

FIG. 110: Western blot analysis of M6_Spy0159 and M6_Spy0160 in GAS M6 strain 2724.

FIG. 111: Western blot analysis of M6_Spy0159 and M6_Spy0160 in GAS M6 strain SF370.

FIG. 112: Western blot analysis of M6_Spy0160 in GAS M6 strain 2724.

FIGS. 113-115: Electron micrographs of surface exposed GAS 15 on GAS M1 strain SF370.

FIGS. 116-121: Electron micrographs of surface exposed GAS 16 on GAS M1 strain SF370.

FIGS. 122-125: Electron micrographs of surface exposed GAS 18 on GAS M1 strain SF370 detected using anti-GAS 18 antisera.

FIG. 126: IEM image of a hyperoligomer on GAS M1 strain SF370 detected using anti-GAS 18 antisera.

FIGS. 127-132: IEM images of oligomeric and hyperoligomeric structures containing M6_Spy0160 extending from the surface of GAS serotype M6 3650.

FIGS. 133A and B: Western blot analysis of L. lactis transformed to express GBS 80 with anti-GBS 80 antiserum.

FIG. 134: Western blot analyses of L. lactis transformed to express GBS AI-1 with anti-GBS 80 antiserum.

FIG. 135: Ponceau staining of same acrylamide gel as used in FIG. 134.

FIG. 136A: Western blot analysis of sonicated pellets and supernatants of cultured L. lactis transformed to express GBS AI-1 polypeptides using anti-GBS 80 antiserum.

FIG. 136B: Polyacrylamide gel electrophoresis of sonicated pellets and supernatants of cultured L. lactis transformed to express GBS AI polypeptides.

FIG. 137: Depiction of an example S. pneumoniae AI locus.

FIG. 138: Schematic of primer hybridization sites within the S. pneumoniae AI locus of FIG. 137.

FIG. 139A: The set of amplicons produced from the S. pneumoniae strain TIGR4 AI locus.

FIG. 139B: Base pair lengths of amplicons produced from FIG. 139A primers in S. pneumoniae strain TIGR4.

FIG. 140: CGH analysis of S. pneumoniae strains for the AI locus.

FIG. 141: Amino acid sequence alignment of polypeptides encoded by AI orf 2 in S. pneumoniae AI-positive strains.

FIG. 142: Amino acid sequence alignment of polypeptides encoded by AI orf 3 in S. pneumoniae AI-positive strains.

FIG. 143: Amino acid sequence alignment of polypeptides encoded by AI orf 4 in S. pneumoniae AI-positive strains.

FIG. 144: Amino acid sequence alignment of polypeptides encoded by AI orf 5 in S. pneumoniae AI-positive strains.

FIG. 145: Amino acid sequence alignment of polypeptides encoded by AI orf 6 in S. pneumoniae AI-positive strains.

FIG. 146: Amino acid sequence alignment of polypeptides encoded by AI orf 7 in S. pneumoniae AI-positive strains.

FIG. 147: Amino acid sequence alignment of polypeptides encoded by AI orf 8 in S. pneumoniae AI-positive strains.

FIG. 148: Diagram comparing amino acid sequences of RrgA in S. pneumoniae strains.

FIG. 149: Amino acid sequence comparison of RrgB in S. pneumoniae strains.

FIG. 150A: Sp0462 amino acid sequence.

FIG. 150B: Primers used to produce a clone encoding the Sp0462 polypeptide.

FIG. 151A: Schematic depiction of recombinant Sp0462 polypeptide.

FIG. 151B: Schematic depiction of full-length Sp0462 polypeptide.

FIG. 152A: Western blot probed with serum obtained from S. pneumoniae-infected patients for Sp0462.

FIG. 152B: Western blot probed with GBS 80 serum for Sp0462.

FIG. 153A: Sp0463 amino acid sequence.

FIG. 153B: Primers used to produce a clone encoding the Sp0463 polypeptide.

FIG. 154A: Schematic depiction of recombinant Sp0463 polypeptide.

FIG. 154B: Schematic depiction of full-length Sp0463 polypeptide.

FIG. 155: Western blot detection of recombinant Sp0463 polypeptide.

FIG. 156: Western blot detection of high molecular weight Sp0463 polymers.

FIG. 157A: Sp0464 amino acid sequence.

FIG. 157B: Primers used to produce a clone encoding the Sp0464 polypeptide.

FIG. 158A: Schematic depiction of recombinant Sp0464 polypeptide.

FIG. 158B: Schematic depiction of full-length Sp0464 polypeptide.

FIG. 159: Western blot detection of recombinant Sp0464 polypeptide.

FIG. 160: Amplification products prepared for production of Sp0462, Sp0463, and Sp0464 clones.

FIG. 161: Opsonic killing by anti-sera raised against L. lactis expressing GBS AI.

FIG. 162: Schematic depicting GAS adhesin islands GAS AI-1, GAS AI-2, GAS AI-3 and GAS AI-4.

FIGS. 163 A-D: Immunoblots of cell-wall fractions of GAS strains with antisera specific for LPXTG proteins of M6_ISS3650 (A), M1_SF370 (B), M5_ISS4883 (C) and M12_20010296 (D).

FIGS. 163 E-H: Immunoblots of cell-wall fractions of deletion mutants M1_SF370Δ128 (E), M1_SF370Δ130 (F), M1_SF370ΔSrtC1 (G) and the M1_128 deletion strain complemented with plasmid pAM::128, which contains the M1_128 gene (H), with antisera specific for the pilin components of M1_SF370.

FIGS. 163 I-N: Immunogold labelling and transmission electron microscopy of: T6 (I) and Cpa (J) in M6_ISS3650; M1_128 in M1_SF370 (K) and deletion strain M1_SF370Δ128 (N); M5_orf80 in M5_ISS4883 (L); M12_EftLSL.A in M12_20010296 (M). The strains used are indicated below the panels. Bars=200 nm.

FIG. 164: Schematic representation of the FCT region from 7 GAS strains.

FIGS. 165 A-H: Flow cytometry of GAS bacteria treated or not with trypsin and stained with sera specific for the major pilus component. Preimmune staining, black lines; untreated bacteria, green lines; and trypsin-treated bacteria, blue lines. M6_ISS3650 stained with sera which recognize the M6 protein (A) or anti-M6_T6 (B), M1_SF370 stained with anti-M1 (C) or anti-M1_128 (D), M5_ISS4883 stained with anti-PrtF (E) or anti-M5_orf80 (F) and M12_20010296 with anti-M12 (G) or anti-EftLSL.A (H).

FIGS. 166 A-C: Immunoblots of recombinant pilin components with polyvalent Lancefield T-typing sera. The recombinant proteins are shown above the blot and the sera pool used is shown below the blot.

FIGS. 166 D-G: Immunoblots of pilin proteins with monovalent T-typing sera. The recombinant proteins are shown below the blot and the sera used above the blot.

FIG. 166 H and I: Flow cytometry analysis of strain M1_SF370 (H) and the deletion strain M1_SF370Δ128 (I) with T-typing antisera pool T.

FIG. 167: Chart describing the number and type of sortase sequences identified within GAS AIs.

FIG. 168 A: Immunogold-electron microscopy of L. lactis lacking an expression construct for GBS AI-1 using anti-GBS 80 antibodies.

FIG. 168 B and C: Immunogold-electron microscopy detects GBS 80 in oligomeric (pilus) structures on surface of L. lactis transformed to express GBS AI-1.

FIG. 169: FACS analysis detects expression of GBS 80 and GBS 104 on the surface of L. lactis transformed to express GBS AI-1.

FIG. 170: Phase contrast microscopy and immuno-electron microscopy show that expression of GBS AI-1 in L. lactis induces L. lactis aggregation.

FIG. 171: Purification of GBS pili from L. lactis transformed to express GBS AI-1.

FIG. 172: Schematic depiction of GAS M6 (AI-1), M1 (AI-2), and M12 (AI-4) adhesin islands and portions of the adhesin islands inserted in the pAM401 construct for expression in L. lactis.

FIG. 173 A-C: Western blot analysis showing assembly of GAS pili in L. lactis expressing GAS AI-2 (M1) (A), GAS AI-4 (M12) (B), and GAS AI-1 (M6) (C).

FIG. 174: FACS analysis of GAS serotype M6 for M6_Spy0157 surface expression.

FIG. 175: FACS analysis of GAS serotype M12 for 19224139 surface expression.

FIG. 176 A-E: Immunogold electron microscopy using antibodies against M6_Spy0160 detects pili on the surface of M6 strain 2724.

FIG. 176 F: Immunogold electron microscopy using antibodies against M6_Spy0159 detects M6_Spy0159 surface expression on M6 strain 2724.

FIG. 177 A-C: Western blot analysis of M1 strain SF370 GAS bacteria individually deleted for M1_130, SrtC1, or M1_128 using anti-M1_130 serum (A), anti-M1_128 serum (B), and anti-M1_126 serum (C).

FIG. 178 A-C: Immunogold electron microscopy using antibodies against M1_128 to detect surface expression on wildtype strain SF370 bacteria (A), M1_128 deleted SF370 bacteria (B), and SrtC1 deleted SF370 bacteria (C).

FIG. 179 A-C: FACS analysis to detect expression of M1_126 (A), M1_128 (B), and M1_130 (C) on the surface of wildtype SF370 GAS bacteria.

FIG. 179 D-F: FACS analysis to detect expression of M1_126 (D), M1_128 (E), and M1_130 (F) on the surface of M1_128 deleted SF370 GAS bacteria.

FIG. 179 G-I: FACS analysis to detect expression of M1_126 (G), M1_128 (H), and M1_130 (I) on the surface of SrtC1 deleted SF370 GAS bacteria.

FIG. 180 A and B: FACS analysis of wildtype (A) and LepA deletion mutant (B) strains of SF370 bacteria for M1 surface expression.

FIG. 181: Western blot analysis detects high molecular weight polymers in S. pneumoniae TIGR4 using anti-RrgB antisera.

FIG. 182: Detection of high molecular weight polymers in S. pneumoniae rlrA positive strains.

FIG. 183: Detection of high molecular weight polymers in S. pneumoniae TIGR4 by silver staining and Western blot analysis using anti-RrgB antisera.

FIG. 184: Deletion of S. pneumoniae TIGR4 adhesin island sequences interferes with the ability of S. pneumoniae to adhere to A549 alveolar cells.

FIG. 185: Negative staining of S. pneumoniae strain TIGR4 showing abundant pili on the bacterial surface.

FIG. 186: Negative staining of strain TIGR4 deleted for rrgA-srtD adhesin island sequences showing no pili on the bacterial surface.

FIG. 187: Negative staining of the TIGR4 mgrA mutant showing abundant pili on the bacterial surface.

FIG. 188: Negative staining of the negative control TIGR4 mgrA mutant deleted for adhesin island sequences rrgA-srtD showing no pili on the bacterial surface.

FIG. 189: Immuno-gold labelling of S. pneumoniae strain TIGR4 grown on blood agar solid medium using α-RrgB (5 nm) and α-RrgC (10 nm). Bar represents 200 nm.

FIG. 190 A and B: Detection of expression and purification of S. pneumoniae RrgA protein by SDS-PAGE (A) and Western blot analysis (B).

FIG. 191: Detection of RrgB by antibodies produced in mice.

FIG. 192: Detection of RrgC by antibodies produced in mice.

FIG. 193: Purification of S. pneumoniae TIGR4 pili by a cultivation and digestion method and detection of the purified TIGR4 pili.

FIG. 194: Purification of S. pneumoniae TIGR4 pili by a sucrose gradient centrifugation method and detection of the purified TIGR4 pili.

FIG. 195: Purification of S. pneumoniae TIGR4 pili by a gel filtration method and detection of the purified TIGR4 pili.

FIG. 196: Alignment of full length S. pneumoniae adhesin island sequences from ten S. pneumoniae strains.

FIG. 197 A: Schematic of GBS AI-1 coding sequences.

FIG. 197 B: Nucleotide sequence of intergenic region between AraC and GBS 80 (SEQ ID NO: 273).

FIG. 197 C: FACS analysis results for GBS 80 expression in GBS strains having different length polyA tracts in the intergenic region between AraC and GBS 80.

FIG. 198: Table comparing the percent identity of surface proteins encoded by a serotype M6 (harbouring a GAS AI-1) adhesin island relative to other GAS serotypes harbouring an adhesin island.

FIG. 199: Table comparing the percent identity of surface proteins encoded by a serotype M1 (harbouring a GAS AI-2) adhesin island relative to other GAS serotypes harbouring an adhesin island.

FIG. 200: Table comparing the percent identity of surface proteins encoded by serotypes M3, M18, M5, and M49 (harbouring GAS AI-3) adhesin islands relative to other GAS serotypes harbouring an adhesin island.

FIG. 201: Table comparing the percent identity of surface proteins encoded by a serotype M12 (harbouring a GAS AI-4) adhesin island relative to other GAS serotypes harbouring an adhesin island.

FIG. 202: GBS 80 recombinant protein does not bind to epithelial cells.

FIG. 203: Deletion of GBS 80 protein does not affect the ability of GBS to adhere to and invade ME180 cervical epithelial cells.

FIG. 204: GBS 80 binds to extracellular matrix proteins.

FIG. 205: Deletion of GBS 104 protein, but not GBS 80, reduces the capacity of GBS to invade J774 macrophage-like cells.

FIG. 206: GBS 104 knockout mutant strains of bacteria translocate through an epithelial monolayer less efficiently than the isogenic wild type strain.

FIG. 207: GBS 80 knockout mutant strains of bacteria partially lose the ability to translocate through an epithelial monolayer.

FIG. 208: GBS adherence to HUVEC endothelial cells.

FIG. 209: Strain growth rate of wildtype, GBS 80-deleted, or GBS 104-deleted COH1 GBS.

FIG. 210: Binding of recombinant GBS 104 protein to epithelial cells by FACS analysis.

FIG. 211: Deletion of GBS 104 protein in the GBS strain COH1 reduces the ability of GBS to adhere to ME180 cervical epithelial cells.

FIG. 212: COH1 strain GBS overexpressing GBS 80 protein has an impaired capacity to translocate through an epithelial monolayer.

FIG. 213: Scanning electron microscopy shows that overexpression of GBS 80 protein on COH1 strain GBS enhances the capacity of the COH1 bacteria to form microcolonies on epithelial cells.

FIG. 214: Confocal imaging shows that overexpression of GBS 80 proteins on COH1 strain GBS enhances the capacity of the COH1 bacteria to form microcolonies on epithelial cells.

FIG. 215: Detection of GBS 59 on the surface of GBS strain 515 by immuno-electron microscopy.

FIG. 216: Detection of GBS 67 on the surface of GBS strain 515 by immuno-electron microscopy.

FIG. 217: GBS 67 binds to fibronectin.

FIG. 218: Western blot analysis shows that deletion of both GBS AI-2 sortase genes abolishes assembly of the pilus.

FIG. 219: FACS analysis shows that deletion of both GBS AI-2 sortase genes abolishes assembly of the pilus.

FIG. 220 A-C: Western blot analysis shows that GBS 59, GBS 67, and GBS 150 form high molecular weight complexes.

FIG. 221 A-C: Western blot analysis shows that GBS 59 is required for polymer formation of GBS 67 and GBS 150.

FIG. 222: FACS analysis shows that GBS 59 is required for surface exposure of GBS 67.

FIG. 223: Summary Western blots for detection of GBS 59, GBS 67, or GBS 150 in GBS 515 and GBS 515 mutant strain.

FIG. 224: Description of GBS 59 Allelic variants.

FIG. 225: GBS 59 is opsonic only against a strain of GBS expressing a homologous GBS 59.

FIG. 226 A and B: Results of FACS analysis for surface expression of GBS 59 using antibodies specific for different GBS 59 isoforms.

FIG. 227 A and B: Results of FACS analysis for surface expression of GBS 80, GBS 104, GBS 322, GBS 67, and GBS 59 on 41 various strains of GBS bacteria.

FIG. 228: Results of FACS analysis for surface expression of GBS 80, GBS 104, GBS 322, GBS 67, and GBS 59 on 41 strains of GBS bacteria obtained from the CDC.

FIG. 229: Expected immunogenicity coverage of different combinations of GBS 80, GBS 104, GBS 322, GBS 67, and GBS 59 across strains of GBS bacteria.

FIG. 230: GBS 59 opsonophagocytic activity is comparable to that of a mixture of GBS 80, GBS 104, GBS 322 and GBS 67.

FIG. 231 A-C: Schematic presentation of example hybrid GBS AIs.

FIG. 232: Schematic presentation of an example hybrid GBS AI.

FIG. 233 A and B: Western blot and FACS analysis detect expression of GBS 80 and GBS 67 on the surface of L. lactis transformed with a hybrid GBS AI.

FIG. 234 A-E: Hybrid GBS AI cloning strategy.

FIG. 235: High magnification of S. pneumoniae strain TIGR4 pili double labeled with α-RrgB (5 nm) and α-RrgC (10 nm). Bar represents 100 nm.

FIG. 236: Immuno-gold labeling of the S. pneumoniae TIGR4 rrgA-srtD deletion mutant with no visible pili on the surface detectable by α-RrgB and α-RrgC. Bar represents 200 nm.

FIG. 237: Variability in GBS 67 amino acid sequences between strains 2603 and H36B.

FIG. 238: Strain variability in GBS 67 amino acid sequences of allele I (2603).

FIG. 239: Strain variability in GBS 67 amino acid sequence of allele II (H36B).

BRIEF DESCRIPTION OF THE TABLES

TABLE 1: Active Maternal Immunization Assay for fragments of GBS 80

TABLE 2: Passive Maternal Immunization Assay for fragments of GBS 80

TABLE 3: Lethal dose 50% of AI-1 mutants from GBS strain isolate 2603.

TABLE 4: GAS AI-1 sequences from M6 isolate (MGAS10394).

TABLE 5: GAS AI-2 sequences from M1 isolate (SF370).

TABLE 6: GAS AI-3 sequences from M3 isolate (MGAS315).

TABLE 7: GAS AI-3 sequences from M3 isolate (SSI-1).

TABLE 8: GAS AI-3 sequences from M18 isolate (MGAS8232).

TABLE 9: S. pneumoniae AI sequences from TIGR4 sequence.

TABLE 10: GAS AI-3 sequences from M5 isolate (Manfredo).

TABLE 11: GAS AI-4 sequences from M12 isolate (A735).

TABLE 12: Conservation of GBS 80 and GBS 104 amino acid sequences.

TABLE 13: Conservation of GBS 322 and GBS 276 amino acid sequences.

TABLE 14: Active maternal immunization assay for a combination of fragments from GBS 322, GBS 80, GBS 104, and GBS 67.

TABLE 15: Antigen surface exposure of GBS 80, GBS 322, GBS 104, and GBS 67.

TABLE 16: Active maternal immunization assay for each of GBS 80 and GBS 322 antigens.

TABLE 17: Active maternal immunization assay for GBS 59.

TABLE 18: Summary of FACS values for surface expression of spyM6_0159.

TABLE 19: Summary of FACS values for surface expression of spyM6_0160.

TABLE 20: Summary of FACS values for surface expression of GAS 15.

TABLE 21: Summary of FACS values for surface expression of GAS 16.

TABLE 22: Summary of FACS values for surface expression of GAS 16 using a second antiserum.

TABLE 23: Summary of FACS values for surface expression of GAS 18.

TABLE 24: Summary of FACS values for surface expression of GAS 18 using a second antiserum.

TABLE 25: Summary of FACS values for surface expression of SpyM3_0098.

TABLE 26: Summary of FACS values for surface expression of SpyM3_0100.

TABLE 27: Summary of FACS values for surface expression of SpyM3_0102 in M3 serotypes.

TABLE 28: Summary of FACS values for surface expression of SpyM3_0102 in M6 serotypes.

TABLE 29: Summary of FACS values for surface expression of SpyM3_0104 in M3 serotypes.

TABLE 30: Summary of FACS values for surface expression of SpyM3_0104 in an M12 serotype.

TABLE 31: Summary of FACS values for surface expression of SPs_0106 in M3 serotypes.

TABLE 32: Summary of FACS values for surface expression of SPs_0106 in an M12 serotype.

TABLE 33: Summary of FACS values for surface expression of 19224134 in an M12 serotype.

TABLE 34: Summary of FACS values for surface expression of 19224134 in M6 serotypes.

TABLE 35: Summary of FACS values for surface expression of 19224135 in an M12 serotype.

TABLE 36: Summary of FACS values for surface expression of 19224137 in an M12 serotype.

TABLE 37: Summary of FACS values for surface expression of 19224141 in an M12 serotype.

TABLE 38: S. pneumoniae strain 670 AI sequences.

TABLE 39: Percent identity comparison of S. pneumoniae strains AI sequences.

TABLE 40: FACS analysis of L. lactis and GBS bacteria strains expressing GBS AI-1.

TABLE 41: Sequences of primers used to amplify AI locus.

TABLE 42: Conservation of amino acid sequences encoded by the S. pneumoniae AI locus.

TABLE 43: Protection of Mice Immunized with L. lactis expressing GBS AI-1.

TABLE 44: GAS AI-3 sequences from M49 isolate (591).

TABLE 45: Comparison of Sequences Between the Four GAS AIs.

TABLE 46: Antibody Responses against GBS 80 in Serum of Mice Immunized with L. lactis Expressing GBS AI-1

TABLE 47: Anti-GBS 80 IgA Antibodies Detected in Mouse Tissues Following Immunization with L. lactis Expressing GBS AI-1

TABLE 48: GBS 67 Protects Mice in an Immunization Assay

TABLE 49: Exposure Levels of GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 on GBS Strains

TABLE 50: High Levels of Surface Protein Expression on GBS Serotypes

TABLE 51: Further Protection of Mice Immunized with L. lactis expressing GBS AI-1

DETAILED DESCRIPTION OF THE INVENTION

The practice of the present invention will employ, unless otherwise indicated, conventional methods of chemistry, biochemistry, molecular biology, immunology and pharmacology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., 19th Edition (1995); Methods In Enzymology (S. Colowick and N. Kaplan, eds., Academic Press, Inc.); and Handbook of Experimental Immunology, Vols. I-IV (D. M. Weir and C. C. Blackwell, eds., 1986, Blackwell Scientific Publications); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Handbook of Surface and Colloidal Chemistry (Birdi, K. S. ed., CRC Press, 1997); Short Protocols in Molecular Biology, 4th ed. (Ausubel et al. eds., 1999, John Wiley & Sons); Molecular Biology Techniques: An Intensive Laboratory Course, (Ream et al., eds., 1998, Academic Press); PCR (Introduction to Biotechniques Series), 2nd ed. (Newton & Graham eds., 1997, Springer Verlag); Peters and Dalrymple, Fields Virology (2d ed), Fields et al. (eds.), B. N. Raven Press, New York, N.Y.

All publications, patents and patent applications cited herein are hereby incorporated by reference in their entireties.

As used herein, an “Adhesin Island” or “AI” refers to a series of open reading frames within a bacterial genome, such as the genome for Group A or Group B Streptococcus or other gram positive bacteria, that encodes for a collection of surface proteins and sortases. An Adhesin Island may encode for amino acid sequences comprising at least one surface protein. Alternatively, an Adhesin Island may encode for at least two surface proteins and at least one sortase. Preferably, an Adhesin Island encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more AI surface proteins may participate in the formation of a pilus structure on the surface of the gram positive bacteria.

Adhesin Islands of the invention preferably include a divergently transcribed transcriptional regulator (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction). The transcriptional regulator may regulate the expression of the AI operon.

GBS Adhesin Island 1

As discussed above, Applicants have identified a new adhesin island, “Adhesin Island 1”, “AI-1”, or “GBS AI-1”, within the genomes of several Group B Streptococcus serotypes and isolates. AI-1 comprises a series of approximately five open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“AI-1 proteins”). Specifically, AI-1 includes open reading frames encoding for two or more (i.e., 2, 3, 4 or 5) of GBS 80, GBS 104, GBS 52, SAG0647 and SAG0648. One or more of the AI-1 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the AI-1 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

A schematic of AI-1 is presented in FIG. 1. AI-1 typically resides on an approximately 16.1 kb transposon-like element frequently inserted into the open reading frame for trmA. One or more of the AI-1 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The AI surface proteins of the invention may affect the ability of the GBS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GBS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The AI-1 sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. AI-1 may encode for at least one surface protein. Alternatively, AI-1 may encode for at least two surface exposed proteins and at least one sortase. Preferably, AI-1 encodes for at least three surface exposed proteins and at least two sortases. The AI-1 protein preferably includes GBS 80 or a fragment thereof or a sequence having sequence identity thereto.

As used herein, an LPXTG motif represents an amino acid sequence comprising at least five amino acid residues. Preferably, the motif includes a leucine (L) in the first amino acid position, a proline (P) in the second amino acid position, a threonine (T) in the fourth amino acid position and a glycine (G) in the fifth amino acid position. The third position, represented by X, may be occupied by any amino acid residue. Preferably, the X position is occupied by lysine (K), glutamate (E), asparagine (N), glutamine (Q) or alanine (A). Preferably, the X position is occupied by lysine (K). In some embodiments, one of the assigned LPXTG amino acid positions is replaced with another amino acid. Preferably, such replacements comprise conservative amino acid replacements, meaning that the replacement amino acid residue has similar physiological properties to the removed amino acid residue. Genetically encoded amino acids may be divided into four families based on physiological properties: (1) acidic (aspartate and glutamate), (2) basic (lysine, arginine, histidine), (3) non-polar (alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan) and (4) uncharged polar (glycine, asparagine, glutamine, cysteine, serine, threonine, and tyrosine). Phenylalanine, tryptophan and tyrosine are sometimes classified jointly as aromatic amino acids. For example, it is reasonably predictable that an isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar conservative replacement of an amino acid with a structurally related amino acid will not have a major effect on the biological activity.
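
By way of illustration only, the four-family grouping described above can be sketched in Python. The names FAMILIES, family_of and is_conservative are hypothetical and form no part of the specification; the sketch simply treats a replacement as conservative when both residues fall in the same family, which is one simplified reading of the paragraph above.

```python
# Illustrative sketch only. The four families mirror the grouping given in the
# text; all names here are hypothetical.
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),       # alanine, valine, leucine, isoleucine,
                                        # proline, phenylalanine, methionine, tryptophan
    "uncharged polar": set("GNQCSTY"),  # glycine, asparagine, glutamine, cysteine,
                                        # serine, threonine, tyrosine
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"not a standard one-letter amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Assume a replacement is conservative when both residues share a family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # The examples named in the text: Leu->Ile, Asp->Glu, Thr->Ser.
    for a, b in [("L", "I"), ("D", "E"), ("T", "S"), ("L", "K")]:
        print(a, "->", b, is_conservative(a, b))
```

Under this simplified criterion, the leucine to isoleucine, aspartate to glutamate and threonine to serine replacements mentioned above are all classified as conservative, whereas leucine to lysine is not.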

The first amino acid position of the LPXTG motif may be replaced with another amino acid residue. Preferably, the first amino acid residue (leucine) is replaced with an alanine (A), valine (V), isoleucine (I), proline (P), phenylalanine (F), methionine (M), glutamic acid (E), glutamine (Q), or tryptophan (W) residue. In one preferred embodiment, the first amino acid residue is replaced with an isoleucine (I).

The second amino acid residue of the LPXTG motif may be replaced with another amino acid residue. Preferably, the second amino acid residue, proline (P), is replaced with a valine (V) residue.

The fourth amino acid residue of the LPXTG motif may be replaced with another amino acid residue. Preferably, the fourth amino acid residue (threonine) is replaced with a serine (S) or an alanine (A).

In general, an LPXTG motif may be represented by the amino acid sequence XXXXG, in which X at amino acid position 1 is an L, a V, an E, an I, an F, or a Q; X at amino acid position 2 is a P if X at amino acid position 1 is an L, an I, or an F; X at amino acid position 2 is a V if X at amino acid position 1 is an E or a Q; X at amino acid position 2 is a V or a P if X at amino acid position 1 is a V; X at amino acid position 3 is any amino acid residue; X at amino acid position 4 is a T if X at amino acid position 1 is a V, E, I, F, or Q; and X at amino acid position 4 is a T, S, or A if X at amino acid position 1 is an L.
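
The position-dependent rules of the preceding paragraph can be expressed, as a non-limiting sketch, in a short Python function. The function name matches_general_lpxtg and the restriction of position 3 to the twenty standard one-letter codes are assumptions made for illustration.

```python
# Illustrative sketch only; names are hypothetical.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Residues allowed at position 2, keyed by the residue at position 1.
ALLOWED_POSITION_2 = {
    "L": "P", "I": "P", "F": "P",  # position 2 is P if position 1 is L, I or F
    "E": "V", "Q": "V",            # position 2 is V if position 1 is E or Q
    "V": "VP",                     # position 2 is V or P if position 1 is V
}


def matches_general_lpxtg(motif: str) -> bool:
    """Check a five-residue string against the XXXXG rules stated in the text."""
    m = motif.upper()
    if len(m) != 5 or any(c not in STANDARD_RESIDUES for c in m):
        return False
    p1, p2, _p3, p4, p5 = m  # position 3 may be any residue
    if p5 != "G" or p1 not in ALLOWED_POSITION_2:
        return False
    if p2 not in ALLOWED_POSITION_2[p1]:
        return False
    # Position 4 may be T, S or A only when position 1 is L; otherwise T.
    allowed_position_4 = "TSA" if p1 == "L" else "T"
    return p4 in allowed_position_4


if __name__ == "__main__":
    for candidate in ["LPKTG", "IPQTG", "LPKSG", "IPKSG", "EVKTG", "EPKTG"]:
        print(candidate, matches_general_lpxtg(candidate))
```

A lookup table keyed on position 1 is used here because every rule in the paragraph is conditioned on the first residue; other encodings, such as a regular expression with alternation, would serve equally well.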

Generally, the LPXTG motif of a GBS AI protein may be represented by the amino acid sequence XPXTG, in which X at amino acid position 1 is L, I, or F, and X at amino acid position 3 is any amino acid residue. Specific examples of LPXTG motifs in GBS AI proteins may include LPXTG (SEQ ID NO: 122) or IPXTG (SEQ ID NO: 133).

As discussed further below, the threonine in the fourth amino acid position of the LPXTG motif may be involved in the formation of a bond between the LPXTG containing protein and a cell wall precursor. Accordingly, in preferred LPXTG motifs, the threonine in the fourth amino acid position is not replaced with another amino acid or, if the threonine is replaced, the replacement amino acid is preferably a conservative amino acid replacement, such as serine.

Instead of an LPXTG motif, the AI surface proteins of the invention may contain alternative sortase substrate motifs such as NPQTN (SEQ ID NO: 142), NPKTN (SEQ ID NO: 168), NPQTG (SEQ ID NO: 169), NPKTG (SEQ ID NO: 170), XPXTGG (SEQ ID NO: 143), LPXTAX (SEQ ID NO: 144), or LAXTGX (SEQ ID NO: 145). (Similar conservative amino acid substitutions can also be made to these membrane motifs).

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5): 2710-2722.

The AI surface proteins may be polymerized into pili by sortase-catalysed transpeptidation. (See FIG. 44.) Cleavage of AI surface proteins by sortase between the threonine and glycine residues of an LPXTG motif yields a thioester-linked acyl intermediate of sortase. Many AI surface proteins include a pilin motif amino acid sequence which interacts with the sortase and LPXTG amino acid sequence. The first lysine residue in a pilin motif can serve as an amino group acceptor of the cleaved LPXTG motif and thereby provide a covalent linkage between AI subunits to form pili. For example, the pilin motif can make a nucleophilic attack on the acyl enzyme, providing a covalent linkage between AI subunits to form pili and regenerating the sortase enzyme. Examples of pilin motifs may include YPKN(X₁₀)K (SEQ ID NO: 146), YPKN(X₉)K (SEQ ID NO: 147), YPK(X₇)K (SEQ ID NO: 148), YPK(X₁₁)K (SEQ ID NO: 149), or PKN(X₉)K (SEQ ID NO: 150). Preferably, the AI surface proteins of the invention include a pilin motif amino acid sequence.
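
As a non-limiting illustration, the example pilin motifs listed above can be written as regular expressions and used to scan a protein sequence for candidate motifs and for the first lysine of each, which the text identifies as the possible amino group acceptor. The names PILIN_PATTERNS and find_pilin_motifs, and the toy sequence used in the demonstration, are hypothetical; X is assumed to stand for any single residue.

```python
import re

# Illustrative sketch only. Each X(n) in the motifs of the text is assumed to
# mean a run of n arbitrary residues.
PILIN_PATTERNS = {
    "YPKN(X10)K": re.compile(r"YPKN[A-Z]{10}K"),
    "YPKN(X9)K": re.compile(r"YPKN[A-Z]{9}K"),
    "YPK(X7)K": re.compile(r"YPK[A-Z]{7}K"),
    "YPK(X11)K": re.compile(r"YPK[A-Z]{11}K"),
    "PKN(X9)K": re.compile(r"PKN[A-Z]{9}K"),
}


def find_pilin_motifs(sequence: str):
    """Return (motif name, 1-based start, 1-based position of first lysine)."""
    seq = sequence.upper()
    hits = []
    for name, pattern in PILIN_PATTERNS.items():
        for match in pattern.finditer(seq):
            first_lysine = match.start() + match.group().index("K")
            hits.append((name, match.start() + 1, first_lysine + 1))
    return hits


if __name__ == "__main__":
    # Toy sequence made up for this example; it is not a sequence of the invention.
    toy = "MA" + "YPKN" + "A" * 10 + "K" + "GG"
    for hit in find_pilin_motifs(toy):
        print(hit)
```

Because YPKN(X₁₀)K is a special case of YPK(X₁₁)K, a single stretch of sequence may be reported under more than one motif name; in the toy sequence both are reported at the same start position.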

Typically, AI surface proteins of the invention will contain an N-terminal leader or secretion signal to facilitate translocation of the surface protein across the bacterial membrane.

Group B Streptococci are known to colonize the urinary tract, the lower gastrointestinal tract and the upper respiratory tract in humans. Electron micrograph images of GBS infection of a cervical epithelial cell line (ME180) are presented in FIG. 25. As shown in these images, the bacteria closely associate with tight junctions between the cells and appear to cross the monolayer by a paracellular route. Similar paracellular invasion of ME180 cells is also shown in the contrast images in FIG. 26. The AI surface proteins of the invention may affect the ability of the GBS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GBS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface.

Applicants have discovered that AI-1 surface protein GBS 104 can bind epithelial cells such as ME180 human cervical cells, A549 human lung cells and Caco2 human intestinal cells (See FIGS. 29 and 210). Further, deletion of the GBS 104 sequence in a GBS strain reduces the capacity of GBS to adhere to ME180 cervical epithelial cells. (See FIGS. 30 and 211). Deletion of GBS 104 also reduces the capacity of GBS to invade J774 macrophage-like cells. (See FIGS. 32 and 205). Deletion of GBS 104 also causes GBS to translocate through epithelial monolayers less efficiently. See FIG. 206. GBS 104 protein therefore appears to bind to ME180 epithelial cells and to have a role in adhesion to epithelial cells and macrophage cell lines.

Similar to the GBS bacteria that are deletion mutants for GBS 104, GBS 80 knockout mutant strains also partially lose the ability to translocate through an epithelial monolayer. See FIG. 207. Deletion of either GBS 80 or GBS 104 in COH1 cells diminishes adherence to HUVEC endothelial cells. See FIG. 208. Deletion of GBS 80 or GBS 104 in COH1 does not, however, affect growth of COH1 either with ME180 cells or in incubation medium (IM). See FIG. 209. Both GBS 80 and GBS 104, therefore, appear to be involved in translocation of GBS through epithelial cells.

GBS 80 does not appear to bind to epithelial cells. Incubation of epithelial cells in the presence of GBS 80 protein followed by FACS analysis using an anti-GBS 80 polyclonal antibody did not detect GBS 80 binding to the epithelial cells. See FIG. 202. Furthermore, deletion of GBS 80 protein does not affect the ability of GBS to adhere to and invade ME180 cervical epithelial cells. See FIG. 203.

Preferably, one or more of the surface proteins may bind to one or more extracellular matrix (ECM) binding proteins, such as fibrinogen, fibronectin, or collagen. As shown in FIGS. 5 and 204, and Example 1, GBS 80, one of the AI-1 surface proteins, can bind to the extracellular matrix binding proteins fibronectin and fibrinogen. While GBS 80 protein apparently does not bind to certain epithelial cells or affect the capacity of a GBS bacterium to adhere to or invade cervical epithelial cells (See FIGS. 27 and 28), removal of GBS 80 from a wild type strain decreases the ability of that strain to translocate through an epithelial cell layer (see FIG. 31).

GBS 80 may also be involved in formation of biofilms. COH1 bacteria overexpressing GBS 80 protein have an impaired ability to translocate through an epithelial monolayer. See FIG. 212. These COH1 bacteria overexpressing GBS 80 form microcolonies on epithelial cells. See FIGS. 213 and 214. These microcolonies may be the initiation of biofilm development.

AI surface proteins may also demonstrate functional homology to previously identified adhesion proteins or extracellular matrix (ECM) binding proteins. For example, GBS 80, a surface protein in AI-1, exhibits some functional homology to FimA, a major fimbrial subunit of the Gram positive bacterium A. naeslundii. FimA is thought to be involved in binding salivary proteins and may be a component of fimbriae on the surface of A. naeslundii. See Yeung et al. (1997) Infection & Immunity 65:2629-2639; Yeung et al. (1998) J. Bacteriol 66:1482-1491; Yeung et al. (1988) J. Bacteriol 170:3803-3809; and Li et al. (2001) Infection & Immunity 69:7224-7233.

A similar functional homology has also been identified between GBS 80 and proteins involved in pili formation in the Gram positive bacterium Corynebacterium diphtheriae (SpaA, SpaD, and SpaH). See, Ton-That et al. (2003) Molecular Microbiology 50(4):1429-1438 and Ton-That et al. (2004) Molecular Microbiology 53(1):251-261. The C. diphtheriae proteins all included a pilin motif of WxxxVxVYPK (SEQ ID NO: 151; where x indicates a varying amino acid residue). The lysine (K) residue is particularly conserved in the C. diphtheriae pilus proteins and is thought to be involved in sortase-catalyzed oligomerization of the subunits involved in the C. diphtheriae pilus structure. (Assembly of the C. diphtheriae pilin subunit SpaA into pili is thought to occur by sortase-catalyzed amide bond cross-linking of adjacent pilin subunits. As the thioester-linked acyl intermediate of sortase requires nucleophilic attack for release, the conserved lysine within the SpaA pilin motif might function as an amino group acceptor of cleaved sorting signals, thereby providing for covalent linkages of the C. diphtheriae pilin subunits. See FIG. 6(d) of Ton-That et al., Molecular Microbiology (2003) 50(4): 1429-1438.)

In addition, an “E box” comprising a conserved glutamic acid residue has also been identified in the C. diphtheriae pilin associated proteins as important in C. diphtheriae pilin assembly. The E box motif generally comprises YxLxETxAPxGY (SEQ ID NO: 152; where x indicates a varying amino acid residue). In particular, the conserved glutamic acid residue within the E box is thought necessary for C. diphtheriae pilus formation.

Preferably, the AI-1 polypeptides of the immunogenic compositions comprise an E box motif. Some examples of E box motifs in the AI-1 polypeptides may include the amino acid sequences YxLxExxxxxGY (SEQ ID NO: 153), YxLxExxxPxGY (SEQ ID NO: 154), or YxLxETxAPxGY (SEQ ID NO: 152). Specifically, the E box motif of the polypeptides may comprise the amino acid sequences YKLKETKAPEGY (SEQ ID NO: 155), YVLKEIETQSGY (SEQ ID NO: 156), or YKLYEISSPDGY (SEQ ID NO: 157).
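
The relationship between the three E box consensus templates and the three specific sequences given above can be checked with a brief sketch. The names E_BOX_TEMPLATES, template_to_regex and matching_templates are hypothetical, and the lower-case x is assumed to stand for any single residue.

```python
import re

# Illustrative sketch only. Templates are taken from the text, ordered from the
# least to the most restrictive.
E_BOX_TEMPLATES = ["YxLxExxxxxGY", "YxLxExxxPxGY", "YxLxETxAPxGY"]


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn a template into a regex, treating each lower-case x as any residue."""
    return re.compile("".join("[A-Z]" if c == "x" else c for c in template))


def matching_templates(candidate: str):
    """List every E box template that the 12-residue candidate satisfies."""
    seq = candidate.upper()
    return [t for t in E_BOX_TEMPLATES if template_to_regex(t).fullmatch(seq)]


if __name__ == "__main__":
    for e_box in ["YKLKETKAPEGY", "YVLKEIETQSGY", "YKLYEISSPDGY"]:
        print(e_box, matching_templates(e_box))
```

Run on the three specific sequences, the sketch reports that YKLKETKAPEGY satisfies all three templates, YKLYEISSPDGY satisfies the first two, and YVLKEIETQSGY satisfies only the least restrictive template, YxLxExxxxxGY.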

As discussed in more detail below, a pilin motif containing a conserved lysine residue and an E box motif containing a conserved glutamic acid residue have both been identified in GBS 80.

While previous publications have speculated that pilus-like structures might be formed on the surface of streptococci (see, e.g., Ton-That et al., Molecular Microbiology (2003) 50(4): 1429-1438), these structures have not been previously visible in negative stain (non-specific) electron micrographs, throwing such speculations into doubt. For example, FIG. 34 presents electron micrographs of GBS serotype III, strain isolate COH1 with a plasmid insert to facilitate the overexpression of GBS 80. This EM photo was produced with a standard negative stain; no pilus structures are distinguishable. In addition, the use of such AI surface proteins in immunogenic compositions for the treatment or prevention of infection against a Gram positive bacteria has not been previously described.

Surprisingly, Applicants have now identified the presence of GBS 80 in surface exposed pilus formations visible in electron micrographs. These structures are only visible when the electron micrographs are specifically stained against an AI surface protein such as GBS 80. Examples of these electron micrographs are shown in FIGS. 11, 16 and 17, which reveal the presence of pilus structures in wild type COH1 Streptococcus agalactiae. Other examples of these electron micrographs are shown in FIG. 49, which reveals that GBS 80 is associated with pili in a wild type clinical isolate of S. agalactiae, JM9030013. (See FIG. 49.)

Applicants have also constructed mutant GBS strains containing a plasmid comprising the GBS 80 sequence resulting in the overexpression of GBS 80 within this mutant. The electron micrographs of FIGS. 13-15 are also stained against GBS 80 and reveal long, oligomeric structures containing GBS 80 which appear to cover portions of the surface of the bacteria and stretch far out into the supernatant.

In some instances, the formation of pili structures on GBS appears to be correlated to surface expression of GBS 80. FIG. 61 provides FACS analysis of GBS 80 surface levels on bacterial strains COH1 and JM9130013 using an anti-GBS 80 antiserum. Immunogold electron microscopy of the COH1 and JM9130013 bacteria using anti-GBS 80 antisera demonstrates that JM9130013 bacteria, which have higher values for GBS 80 surface expression, also form longer pili structures.

The surface exposure of GBS 80 on GBS is generally not capsule-dependent. FIG. 62 provides FACS analysis of capsulated and uncapsulated GBS analyzed with anti-GBS 80 and anti-GBS 322 antibodies. Surface exposure of GBS 80, unlike GBS 322, is not capsule dependent.

An Adhesin Island surface protein, such as GBS 80, appears to be required for pili formation, as does an Adhesin Island sortase. Pili are formed in Coh1 bacterial clones that overexpress GBS 80, but lack GBS 104, or one of the AI-1 sortases sag0647 or sag0648. However, pili are not formed in Coh1 bacterial clones that overexpress GBS 80 and lack both sag0647 and sag0648. Thus, for example, it appears that at least GBS 80 and a sortase, sag0647 or sag0648, may be necessary for pili formation. (See FIG. 48.) Overexpression of GBS 80 in GBS strain 515, which lacks an AI-1, also results in assembly of GBS 80 into pili. GBS strain 515 contains an AI-2, and thus AI-2 sortases. The AI-2 sortases in GBS strain 515 apparently polymerize GBS 80 into pili. (See FIG. 42.) Overexpression of GBS 80 in GBS strain 515 cells knocked out for GBS 67 expression also apparently results in polymerization of GBS 80 into pili. (See FIG. 72.)

While GBS 80 appears to be required for GBS AI-1 pili formation, GBS 104 and sortase SAG0648 appear to be important for efficient AI-1 pili assembly. For example, high molecular weight structures are not assembled in isogenic COH1 strains which lack expression of GBS 80 due to gene disruption and are less efficiently assembled in isogenic COH1 strains which lack the expression of GBS 104 (see FIG. 41). This GBS strain comprises high molecular weight pili structures composed of covalently linked GBS 80 and GBS 104 subunits. In addition, deleting SAG0648 in COH1 bacteria interferes with assembly of some of the high molecular weight pili structures, indicating that SAG0648 plays a role in assembly of these pilin species. (See FIG. 41).

EM photos confirm the involvement of AI surface protein GBS 104 within the hyperoligomeric structures of a GBS strain adapted for increased GBS 80 expression. (See FIGS. 34-41 and Example 6). In a wild type serotype VIII GBS strain, strain JM9030013, IEM identifies GBS 104 as forming clusters on the bacterial surface. (See FIG. 50.)

GBS 52 also appears to be a component of the GBS pili. Immunoblots using an anti-GBS 80 antiserum on total cell extracts of Coh1 and a GBS 52 null mutant Coh1 reveal a shift in detected proteins in the Coh1 wild type strain relative to the GBS 52 null mutant Coh1 strain. The shifted proteins were also detected in the wild type Coh1 bacteria with an anti-GBS 52 antiserum, indicating that GBS 52 may be present in the pilus. (See FIG. 45.)

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as GBS 80. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine amino acid residue.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include one or both of a pilin motif comprising a conserved lysine residue and an E box motif comprising a conserved glutamic acid residue.

More than one AI surface protein may be present in the oligomeric, pilus-like structures of the invention. For example, GBS 80 and GBS 104 may be incorporated into an oligomeric structure. Alternatively, GBS 80 and GBS 52 may be incorporated into an oligomeric structure, or GBS 80, GBS 104 and GBS 52 may be incorporated into an oligomeric structure.

In another embodiment, the invention includes compositions comprising two or more AI surface proteins. The composition may include surface proteins from the same adhesin island. For example, the composition may include two or more GBS AI-1 surface proteins, such as GBS 80, GBS 104 and GBS 52. The surface proteins may be isolated from Gram positive bacteria or they may be produced recombinantly.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a GBS Adhesin Island protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more GBS Adhesin Island 1 (“AI-1”) proteins and one or more GBS Adhesin Island 2 (“AI-2”) proteins, wherein one or more of the Adhesin Island proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

The oligomeric, pilus-like structures of the invention may be combined with one or more additional GBS proteins. In one embodiment, the oligomeric, pilus-like structures comprise one or more AI surface proteins in combination with a second GBS protein. The second GBS protein may be a known GBS antigen, such as GBS 322 (commonly referred to as “sip”) or GBS 276. Nucleotide and amino acid sequences of GBS 322 sequenced from serotype V isolated strain 2603 V/R are set forth in WO 02/35771 as SEQ ID 8539 and SEQ ID 8540 and in the present specification as SEQ ID NOs: 38 and 39. A particularly preferred GBS 322 polypeptide lacks the N-terminal signal peptide, amino acid residues 1-24. An example of a preferred GBS 322 polypeptide is a 407 amino acid fragment and is shown in SEQ ID NO: 40. Examples of preferred GBS 322 polypeptides are further described in PCTUS04/______, attorney docket number PP20665.002 filed Sep. 15, 2004, hereby incorporated by reference, published as WO 2005/002619.

Additional GBS proteins which may be combined with the GBS AI surface proteins of the invention are also described in WO 2005/002619. These GBS proteins include GBS 91, GBS 184, GBS 305, GBS 330, GBS 338, GBS 361, GBS 404, GBS 690, and GBS 691.

Additional GBS proteins which may be combined with the GBS AI surface proteins of the invention are described in WO 02/34771.

GBS polysaccharides which may be combined with the GBS AI surface proteins of the invention are described in WO 2004/041157. For example, the GBS AI surface proteins of the invention may be combined with a GBS polysaccharide selected from the group consisting of serotype Ia, Ib, Ia/c, II, III, IV, V, VI, VII and VIII.

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures in which the bacteria express an AI surface protein. The invention therefore includes a method for manufacturing an oligomeric AI surface antigen comprising culturing a GBS bacterium that expresses the oligomeric AI protein and isolating the expressed oligomeric AI protein from the GBS bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed AI protein. Preferably, the AI protein is in a hyperoligomeric form. Macromolecular structures associated with oligomeric pili are observed in the supernatant of cultured GBS strain Coh1. (See FIG. 46.) These pili are found in the supernatant at all growth phases of the cultured Coh1 bacteria. (See FIG. 47.)

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures overexpressing an AI surface protein. The invention therefore includes a method for manufacturing an oligomeric Adhesin Island surface antigen comprising culturing a GBS bacterium adapted for increased AI protein expression and isolating the expressed oligomeric Adhesin Island protein from the GBS bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed Adhesin Island protein. Preferably, the Adhesin Island protein is in a hyperoligomeric form.

The GBS bacteria are preferably adapted to increase AI protein expression by at least two (e.g., 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150 or 200) times wild type expression levels.

GBS bacteria may be adapted to increase AI protein expression by any means known in the art, including methods of increasing gene dosage and methods of gene upregulation. Such means include, for example, transformation of the GBS bacteria with a plasmid encoding the AI protein. The plasmid may include a strong promoter or it may include multiple copies of the sequence encoding the AI protein. Optionally, the sequence encoding the AI protein within the GBS bacterial genome may be deleted. Alternatively, or in addition, the promoter regulating the GBS Adhesin Island may be modified to increase expression.

GBS bacteria harbouring a GBS AI-1 may also be adapted to increase AI protein expression by altering the number of adenosine nucleotides present at two sites in the intergenic region between AraC and GBS 80. See FIG. 197 A, which is a schematic showing the organization of GBS AI-1, and FIG. 197 B, which provides the sequence of the intergenic region between AraC and GBS 80 in the AI. The adenosine tracts which applicants have identified as influencing GBS 80 surface expression are at nucleotide positions 187 and 233 of the sequence shown in FIG. 197 B (SEQ ID NO: 273). Applicants determined the influence of these adenosine tracts on GBS 80 surface expression in strains of GBS bacteria harboring four adenosines at position 187 and six adenosines at position 233, five adenosines at position 187 and six adenosines at position 233, and five adenosines at position 187 and seven adenosines at position 233. FACS analysis of these strains using anti-GBS 80 antiserum determined that strains having an intergenic region with five adenosines at position 187 and six adenosines at position 233 had higher expression levels of GBS 80 on their surface than other strains. See FIG. 197 C for results obtained from the FACS analysis. Therefore, manipulating the number of adenosines present at positions 187 and 233 of the AraC and GBS 80 intergenic region may further be used to adapt GBS to increase AI protein expression.
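
Purely as an illustration of how the length of an adenosine tract might be read from a nucleotide sequence, the following sketch counts consecutive adenosines beginning at a given position. It assumes, without confirmation from the specification, that the stated position marks the first adenosine of the tract and that positions are numbered from 1; the function name adenosine_run_length and the toy sequence are hypothetical, and the toy sequence is not SEQ ID NO: 273.

```python
def adenosine_run_length(sequence: str, position: int) -> int:
    """Count consecutive A residues starting at the 1-based `position`.

    Assumption for illustration: `position` is the first nucleotide of the tract.
    """
    seq = sequence.upper()
    start = position - 1
    if not 0 <= start < len(seq):
        raise ValueError("position lies outside the sequence")
    length = 0
    while start + length < len(seq) and seq[start + length] == "A":
        length += 1
    return length


if __name__ == "__main__":
    # Toy sequence invented for this example: a five-A tract at position 3
    # and a six-A tract at position 11.
    toy = "GCAAAAATCGAAAAAAG"
    print(adenosine_run_length(toy, 3), adenosine_run_length(toy, 11))
```

Applied to a real intergenic sequence, a pair of such counts at the two tract positions would identify which of the tract-length combinations described above a given strain carries.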

The invention further includes GBS bacteria which have been adapted to produce increased levels of AI surface protein. In particular, the invention includes GBS bacteria which have been adapted to produce oligomeric or hyperoligomeric AI surface protein, such as GBS 80. In one embodiment, the Gram positive bacteria of the invention are inactivated or attenuated to permit in vivo delivery of the whole bacteria, with the AI surface protein exposed on its surface.

The invention further includes GBS bacteria which have been adapted to have increased levels of expressed AI protein incorporated in pili on their surface. The GBS bacteria may be adapted to have increased exposure of oligomeric or hyperoligomeric AI proteins on their surface by increasing expression levels of a signal peptidase polypeptide. Increased levels of a local signal peptidase expression in Gram positive bacteria (such as LepA in GAS) are expected to result in increased exposure of pili proteins on the surface of Gram positive bacteria. Increased expression of a leader peptidase in GBS may be achieved by any means known in the art, such as increasing gene dosage and methods of gene upregulation. The GBS bacteria adapted to have increased levels of leader peptidase may additionally be adapted to express increased levels of at least one pili protein.

Alternatively, the AI proteins of the invention may be expressed on the surface of a non-pathogenic Gram positive bacteria, such as Streptococcus gordonii (See, e.g., Byrd et al., “Biological consequences of antigen and cytokine co-expression by recombinant Streptococcus gordonii vaccine vectors”, Vaccine (2002) 20:2197-2205) or Lactococcus lactis (See, e.g., Mannam et al., “Mucosal Vaccine Made from Live, Recombinant Lactococcus lactis Protects Mice against Pharyngeal Infection with Streptococcus pyogenes”, Infection and Immunity (2004) 72(6):3444-3450). As used herein, non-pathogenic Gram positive bacteria refer to Gram positive bacteria which are compatible with a human host subject and are not associated with human pathogenesis. Preferably, the non-pathogenic bacteria are modified to express the AI surface protein in oligomeric or hyper-oligomeric form. Sequences encoding an AI surface protein and, optionally, an AI sortase, may be integrated into the non-pathogenic Gram positive bacterial genome or inserted into a plasmid. The non-pathogenic Gram positive bacteria may be inactivated or attenuated to facilitate in vivo delivery of the whole bacteria, with the AI surface protein exposed on their surface. Alternatively, the AI surface protein may be isolated or purified from a bacterial culture of the non-pathogenic Gram positive bacteria. For example, the AI surface protein may be isolated from cell extracts or culture supernatants. Alternatively, the AI surface protein may be isolated or purified from the surface of the non-pathogenic Gram positive bacteria.

The non-pathogenic Gram positive bacteria may be used to express any of the Gram positive bacterial Adhesin Island proteins described herein, including proteins from a GBS Adhesin Island, a GAS Adhesin Island, or a S. pneumoniae Adhesin Island. The non-pathogenic Gram positive bacteria are transformed to express an Adhesin Island surface protein. Preferably, the non-pathogenic Gram positive bacteria also express at least one Adhesin Island sortase. The AI transformed non-pathogenic Gram positive bacteria of the invention may be used to prevent or treat infection with a pathogenic Gram positive bacteria, such as GBS, GAS or Streptococcus pneumoniae. The non-pathogenic Gram positive bacteria may express the Gram positive bacterial Adhesin Island proteins in oligomeric forms that further comprise adhesin island proteins encoded within the genome of the non-pathogenic Gram positive bacteria.

Applicants modified L. lactis to demonstrate that it can express GBS AI polypeptides. L. lactis was transformed with a construct encoding GBS 80 under its own promoter and terminator sequences. The transformed L. lactis appeared to express GBS 80 as shown by Western blot analysis using anti-GBS 80 antiserum. See lanes 6 and 7 of the Western blots provided in FIGS. 133A and 133B (133A and 133B are two different exposures of the same Western blot). See also Example 13.

Applicants also transformed L. lactis with a construct encoding GBS AI-1 polypeptides GBS 80, GBS 52, SAG0647, SAG0648, and GBS 104 under the GBS 80 promoter and terminator sequences. These L. lactis expressed high molecular weight structures that were immunoreactive with anti-GBS 80 in immunoblots. See FIG. 134, lane 2, which shows detection of a GBS 80 monomer and higher molecular weight polymers in total transformed L. lactis extracts. Thus, it appeared that L. lactis is capable of expressing GBS 80 in oligomeric form. The high molecular weight polymers were detected not only in L. lactis extracts, but also in the culture supernatants. See FIG. 135 at lane 4. See also Example 14. Thus, the GBS AI polypeptides in oligomeric form can be isolated and purified from either L. lactis cell extracts or culture supernatants. These oligomeric forms can, for instance, be released by sonication and then isolated from cell extracts or culture supernatants. See FIGS. 136A and B. See also FIG. 171, which shows purification of GBS pili from whole extracts of L. lactis expressing the GBS AI-1 following sonication and gel filtration on a Sephacryl HR 400 column.

Furthermore, the L. lactis transformed with the construct encoding GBS AI-1 polypeptides GBS 80, GBS 52, SAG0647, SAG0648, and GBS 104 under the GBS 80 promoter and terminator sequences expressed the GBS AI-1 polypeptides on its surface. FACS analysis of these transformed L. lactis detected cell surface expression of both GBS 80 and GBS 104. The surface expression levels of GBS 80 and GBS 104 on the transformed L. lactis were similar to the surface expression levels of GBS 80 and GBS 104 on GBS strains COH1 and JM9130013, which naturally express GBS AI-1.

See FIG. 169 for FACS analysis data for L. lactis transformed with GBS AI-1 and wildtype JM9130013 bacteria using anti-GBS 80 and anti-GBS 104 antisera. Table 40 provides the results of FACS analysis of transformed L. lactis, COH1, and JM9130013 bacteria using anti-GBS 80 and anti-GBS 104 antisera. The numbers provided represent the mean fluorescence value difference calculated for immune versus pre-immune sera obtained for each bacterial strain.

TABLE 40
FACS analysis of L. lactis and GBS bacteria strains expressing GBS AI-1

Strain                            Anti-GBS 80 antiserum   Anti-GBS 104 antiserum
GBS AI-1 transformed L. lactis    298                     251
GBS COH1                          305                     305
GBS JM9130013                     461                     355

Immunogold-electron microscopy performed with anti-GBS 80 primary antibodies detected the presence of pilus structures on the surface of the L. lactis bacteria expressing GBS AI-1, confirming the results of the FACS analysis. See FIG. 168 B and C. Interestingly, this expression of GBS pili on the surface of the L. lactis induced L. lactis aggregation. See FIG. 170. Thus, GBS AI polypeptides may also be isolated and purified from the surface of L. lactis. The ability of L. lactis to express GBS AI polypeptides on its surface also demonstrates that it may be useful as a host to deliver GBS AI antigens.

In fact, immunization of mice with L. lactis transformed with GBS AI-1 was protective in a subsequent challenge with GBS. Female mice were immunized with L. lactis transformed with GBS AI-1. The immunized female mice were bred and their pups were challenged with a dose of GBS sufficient to kill 90% of non-immunized pups. Detailed protocols for intranasal and subcutaneous immunization of mice with transformed L. lactis can be found in Examples 18 and 19, respectively. Table 43 provides data showing that immunization of the female mice with L. lactis expressing GBS AI-1 (LL-AI 1) greatly increased the survival rate of challenged pups relative to both a negative PBS control (PBS) and a negative L. lactis control (LL 10 E9, which is wild type L. lactis not transformed to express GBS AI-1).

TABLE 43
Protection of Mice Immunized with L. lactis expressing GBS AI-1

Immunization Route   Antigen              Alive/Treated   Survival %   Survival % Range   p value
Intraperitoneal      Recombinant GBS 80   16/18           89           80-100             <0.001
Subcutaneous         LL-AI 1 10 E9        40/49           82           70-90              <0.001
                     LL-AI 1 10 E10       50/60           83           60-100             <0.001
                     PBS                  4/30            13           0-30
                     LL 10 E9             3/57            5            0-20
Intranasal           LL-AI 1 10 E9        22/60           37           0-100              0.02
                     LL-AI 1 10 E10       31/49           63           30-90              <0.001
                     LL 10 E9             2/27            7            0-20
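The Survival % column of Table 43 is the number of surviving pups divided by the number challenged, rounded to a whole percent. A minimal sketch of that arithmetic follows; the variable and function names are illustrative assumptions, while the alive/treated counts are taken from Table 43.

```python
# (route, antigen, alive, treated) as reported in Table 43
table_43 = [
    ("Intraperitoneal", "Recombinant GBS 80", 16, 18),
    ("Subcutaneous", "LL-AI 1 10 E9", 40, 49),
    ("Subcutaneous", "LL-AI 1 10 E10", 50, 60),
    ("Subcutaneous", "PBS", 4, 30),
    ("Subcutaneous", "LL 10 E9", 3, 57),
    ("Intranasal", "LL-AI 1 10 E9", 22, 60),
    ("Intranasal", "LL-AI 1 10 E10", 31, 49),
    ("Intranasal", "LL 10 E9", 2, 27),
]


def survival_percent(alive: int, treated: int) -> int:
    """Percentage of challenged pups that survived, rounded to a whole number."""
    return round(100 * alive / treated)


survival = [survival_percent(alive, treated) for _, _, alive, treated in table_43]

for (route, antigen, alive, treated), pct in zip(table_43, survival):
    print(f"{route:16s} {antigen:20s} {alive}/{treated} -> {pct}%")
```

The computed values agree with the Survival % column as reported in Table 43.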

Table 51 provides further evidence that immunization of mice with L. lactis transformed with GBS AI-1 is protective against GBS.

TABLE 51
Further Protection of Mice Immunized with L. lactis expressing GBS AI-1

Antigen                      Immunization route   Alive/Treated   Survival % (Pval < 0.0000001)
Recombinant GBS 80           IP                   48/50           92
Recombinant GBS 80           SC                   21/30           70
L. lactis + AI1 10⁶ cfu      SC                   6/66            9
L. lactis + AI1 10⁷ cfu      SC                   47/70           73
L. lactis + AI1 10⁸ cfu      SC                   116/153         76
L. lactis + AI1 10⁹ cfu      SC                   98/118          83
L. lactis + AI1 10¹⁰ cfu     SC                   107/129         83
L. lactis 10¹⁰ cfu           SC                   4/83            5
PBS                          SC                   6/110           5
L. lactis + AI1 10¹⁰ cfu     IN                   51/97           52
L. lactis 10¹¹ cfu           IN                   1/40            7
PBS                          IN                   0/37            0

Protection of mice immunized with L. lactis expressing the GBS AI-1 is at least partly due to a newly raised antibody response. Table 46 provides anti-GBS 80 antibody titers detected in serum of the mice immunized with L. lactis expressing the GBS AI-1 as described above. Mice immunized with L. lactis expressing the GBS AI-1 have anti-GBS 80 antibody titers, which are not observed in mice immunized with L. lactis not transformed to express the GBS AI-1. Further, as expected from the survival data, mice subcutaneously immunized with L. lactis transformed to express the GBS AI-1 have significantly higher serum anti-GBS 80 antibody titers than mice intranasally immunized with L. lactis transformed to express the GBS AI-1.

TABLE 46
Antibody Responses against GBS 80 in Serum of Mice Immunized with L. lactis Expressing GBS AI-1

                      Ab Titre Obtained Following
Antigen               Subcutaneous    Intranasal      Intraperitoneal
                      Immunization    Immunization    Immunization
LL 10 E9              0               0
LL-AI 1 10 E9         14000           50
LL-AI 1 10 E10        25000           406
Recombinant GBS 80                                    120000

Anti-GBS 80 antibodies of the IgA isotype were specifically detected in various body fluids of the mice subcutaneously or intranasally immunized with L. lactis expressing the GBS AI-1.

TABLE 47
Anti-GBS 80 IgA Antibodies Detected in Mouse Tissues Following Immunization with L. lactis Expressing GBS AI-1

                                  Anti-GBS 80 IgA Antibodies Detected in
Antigen     Immunization route    Serum    Vaginal Wash    Nasal Wash
LL 10 E9                          0        0               0
LL-AI 1     Subcutaneous          0        25              20
LL-AI 1     Intranasal            140      0               150
GBS 80      Intraperitoneal       60       0

Furthermore, opsonophagocytosis assays also demonstrated that at least some of the antiserum produced against the L. lactis expressing GBS AI-1 is opsonic for GBS. See FIG. 161.

To obtain protection against GBS across a greater number of strains and serotypes, it is possible to transform L. lactis with a recombinant GBS AI encoding both GBS AI-1 and AI-2 proteins, i.e., a hybrid GBS AI. By way of example, a hybrid GBS AI may be a GBS AI-1 with a replacement of the GBS 104 gene with a GBS 67 gene. A schematic of such a hybrid GBS AI is depicted in FIG. 231 A. A hybrid GBS AI may alternatively be a GBS AI-1 with a replacement of the GBS 52 gene with a GBS 59 gene. See the schematic at FIG. 231 B. Alternatively, a hybrid GBS AI may be a GBS AI-1 with a substitution of a GBS 59 polypeptide for the GBS 52 gene and a substitution of the GBS 104 gene for genes encoding GBS 59 and the two GBS AI-2 sortases. Another example of a hybrid GBS AI is a GBS AI-1 with the substitution of a GBS 59 gene for the GBS 52 gene and a GBS 67 gene for the GBS 104 gene. See the schematic at FIG. 232. A further example of a hybrid GBS AI is a GBS AI-1 having a GBS 59 gene and genes encoding the GBS AI-2 sortases in place of the GBS 52 gene. Yet another example of a hybrid GBS AI is a GBS AI-1 with a substitution of either GBS 52 or GBS 104 with a fusion protein comprising GBS 322 and one of GBS 59, GBS 67, or GBS 150. Some of these hybrid GBS AIs may be prepared as briefly outlined in FIG. 234 A-F.

Applicants have prepared a hybrid GBS AI having a GBS AI-1 sequence with a substitution of a GBS 67 coding sequence for the GBS 104 gene as depicted in FIG. 231 A. Transformation of L. lactis with this hybrid GBS AI resulted in L. lactis expression of high molecular weight polymers containing the GBS 80 and GBS 67 proteins. See FIG. 233 A, which provides Western blot analysis of L. lactis transformed with the hybrid GBS AI depicted in FIG. 231 A. When L. lactis transformed with the hybrid GBS AI were probed with antibodies to GBS 80 or GBS 67, high molecular weight structures were detected. See lanes labelled LL+a) in both the α-80 and α-67 immunoblots. The GBS 80 and GBS 67 proteins were confirmed to be present on the surface of L. lactis by FACS analysis. See FIG. 233 B, which shows a shift in fluorescence when GBS 80 and GBS 67 antibodies are used to detect GBS 80 and GBS 67 surface expression. The same shifts in fluorescence were not observed in L. lactis control cells, i.e., cells not transformed with the hybrid GBS AI.

Alternatively, the oligomeric, pilus-like structures may be produced recombinantly. If produced in a recombinant host cell system, the AI surface protein will preferably be expressed in coordination with the expression of one or more of the AI sortases of the invention. Such AI sortases will facilitate oligomeric or hyperoligomeric formation of the AI surface protein subunits.

AI sortases of the invention will typically have a signal peptide sequence within the first 70 amino acid residues. They may also include a transmembrane sequence within 50 amino acid residues of the C terminus. The sortases may also include at least one basic amino acid residue within the last 8 amino acids. Preferably, the sortases have one or more active site residues, such as a catalytic cysteine and histidine.
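Some of the sequence features listed above lend themselves to a quick computational screen. The sketch below is a rough illustration only and rests on several assumptions that are not part of the description: lysine and arginine are taken as the basic residues, a transmembrane sequence is approximated as an uninterrupted run of at least 15 hydrophobic residues, the mere presence of cysteine and histidine stands in for the active site, and the signal peptide is not checked at all. The function names and the toy sequences are hypothetical.

```python
HYDROPHOBIC = set("AILMFVW")  # assumption: crude hydrophobic alphabet
BASIC = set("KR")             # assumption: lysine and arginine only


def longest_hydrophobic_run(segment: str) -> int:
    """Length of the longest uninterrupted run of hydrophobic residues."""
    best = current = 0
    for aa in segment:
        current = current + 1 if aa in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def sortase_like_features(seq: str, tm_window: int = 50, tail: int = 8,
                          min_tm_run: int = 15) -> dict:
    """Report which of the listed sequence features a candidate appears to have."""
    seq = seq.upper()
    return {
        "basic_residue_in_last_8": any(aa in BASIC for aa in seq[-tail:]),
        "hydrophobic_stretch_near_c_terminus":
            longest_hydrophobic_run(seq[-tm_window:]) >= min_tm_run,
        "has_cysteine_and_histidine": "C" in seq and "H" in seq,
    }


# Hypothetical toy sequences, not real sortases.
candidate = "M" + "S" * 20 + "H" + "T" * 10 + "C" + "G" * 10 + "L" * 8 + "V" * 8 + "GSKRE"
negative = "MSTG" * 10

print(sortase_like_features(candidate))
print(sortase_like_features(negative))
```

A screen of this kind can only flag candidates; dedicated signal peptide and transmembrane prediction tools would be needed for any real assessment.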

As shown in FIG. 1, AI-1 includes the surface exposed proteins of GBS 80, GBS 52 and GBS 104 and the sortases SAG0647 and SAG0648. AI-1 typically appears as an insertion into the 3′ end of the trmA gene.

In addition to the open reading frames encoding the AI-1 proteins, AI-1 may also include a divergently transcribed transcriptional regulator such as araC (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction). It is believed that araC may regulate the expression of the AI operon. (See Korbel et al., Nature Biotechnology (2004) 22(7): 911-917 for a discussion of divergently transcribed regulators in E. coli).

AI-1 may also include a sequence encoding a rho independent transcriptional terminator (see hairpin structure in FIG. 1). The presence of this structure within the adhesin island is thought to interrupt transcription after the GBS 80 open reading frame, leading to increased expression of this surface protein.

A schematic identifying AI-1 within several GBS serotypes is depicted in FIG. 2. AI-1 sequences were identified in GBS serotype V, strain isolate 2603; GBS serotype III, strain isolate NEM316; GBS serotype II, strain isolate 18RS21; GBS serotype V, strain isolate CJB111; GBS serotype III, strain isolate COH1; and GBS serotype Ia, strain isolate A909. (Percentages shown are amino acid identity to the 2603 sequence). (An AI-1 was not identified in GBS serotype Ib, strain isolate H36B or GBS serotype Ia, strain isolate 515).

An alignment of AI-1 polynucleotide sequences from serotype V, strain isolates 2603 and CJB111; serotype II, strain isolate 18RS21; serotype III, strain isolates COH1 and NEM316; and serotype Ia, strain isolate A909 is presented in FIG. 18. An alignment of amino acid sequences of AI-1 surface protein GBS 80 from serotype V, strain isolates 2603 and CJB111; serotype Ia, strain isolate A909; and serotype III, strain isolates COH1 and NEM316 is presented in FIG. 22. An alignment of amino acid sequences of AI-1 surface protein GBS 104 from serotype V, strain isolates 2603 and CJB111; serotype III, strain isolates COH1 and NEM316; and serotype II, strain isolate 18RS21 is presented in FIG. 23. Preferred AI-1 polynucleotide and amino acid sequences are conserved among two or more GBS serotypes or strain isolates.

As shown in this figure, the full length of surface protein GBS 80 is particularly conserved among GBS serotypes V (strain isolates 2603 and CJB111), III (strain isolates NEM316 and COH1), and Ia (strain isolate A909). The GBS 80 surface protein is missing or fragmented in serotypes II (strain isolate 18RS21), Ib (strain isolate H36B) and Ia (strain isolate 515).

Polynucleotide and amino acid sequences for AraC are set forth in FIG. 30.

GBS Adhesin Island 2

A second adhesin island, “Adhesin Island 2” or “AI-2” or “GBS AI-2”, has also been identified in numerous GBS serotypes. A schematic depicting the correlation between AI-1 and AI-2 within the GBS serotype V, strain isolate 2603 is shown in FIG. 3. (Homology percentages in FIG. 3 represent amino acid identity of the AI-2 proteins to the AI-1 proteins). Alignments of AI-2 polynucleotide sequences are presented in FIGS. 20 and 21 (FIG. 20 includes sequences from serotype V, strain isolate 2603 and serotype III, strain isolate NEM316. FIG. 21 includes sequences from serotype III, strain isolate COH1 and serotype Ia, strain isolate A909). An alignment of amino acid sequences of AI-2 surface protein GBS 067 from serotype V, strain isolates 2603 and CJB111; serotype Ia, strain isolate 515; serotype II, strain isolate 18RS21; serotype Ib, strain isolate H36B; and serotype III, strain isolate NEM316 is presented in FIG. 24. Preferred AI-2 polynucleotide and amino acid sequences are conserved among two or more GBS serotypes or strain isolates.

AI-2 comprises a series of approximately five open reading frames encoding a collection of amino acid sequences comprising surface proteins and sortases. Specifically, AI-2 includes open reading frames encoding two or more (i.e., 2, 3, 4, 5 or more) of GBS 67, GBS 59, GBS 150, SAG1405, SAG1406, 01520, 01521, 01522, 01523, 01524 and 01525. In one embodiment, AI-2 includes open reading frames encoding two or more of GBS 67, GBS 59, GBS 150, SAG1405, and SAG1406. Alternatively, AI-2 may include open reading frames encoding two or more of 01520, 01521, 01522, 01523, 01524 and 01525.

One or more of the surface proteins typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The GBS AI-2 sortase proteins are thought to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GBS AI-2 may encode at least one surface protein. Alternatively, AI-2 may encode at least two surface proteins and at least one sortase. Preferably, GBS AI-2 encodes at least three surface proteins and at least two sortases. One or more of the AI-2 surface proteins may include an LPXTG or other sortase substrate motif.

One or more of the surface proteins may also typically include a pilin motif. The pilin motif may be involved in pili formation. Cleavage of AI surface proteins by sortase between the threonine and glycine residues of an LPXTG motif yields a thioester-linked acyl intermediate of sortase. The first lysine residue in a pilin motif can serve as an amino group acceptor of the cleaved LPXTG motif and thereby provide a covalent linkage between AI subunits to form pili. For example, the pilin motif can make a nucleophilic attack on the acyl enzyme, providing a covalent linkage between AI subunits to form pili and regenerating the sortase enzyme. Some examples of pilin motifs that may be present in the GBS AI-2 proteins include (YPKN(X₈)K; SEQ ID NO: 158), (PK(X₈)K; SEQ ID NO: 159), (YPK(X₉)K; SEQ ID NO: 160), (PKN(X₈)K; SEQ ID NO: 161), or (PK(X₁₀)K; SEQ ID NO: 162).
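The pilin motifs listed above translate directly into regular expressions, with X read as any residue. The following is a minimal sketch for scanning a protein sequence for them; the function name and the toy sequence are hypothetical, and a lookahead is used so that overlapping matches are all reported.

```python
import re

# Pilin motifs from the text, with X(n) written as n arbitrary residues.
PILIN_MOTIFS = {
    "YPKN(X8)K": r"YPKN.{8}K",
    "PK(X8)K": r"PK.{8}K",
    "YPK(X9)K": r"YPK.{9}K",
    "PKN(X8)K": r"PKN.{8}K",
    "PK(X10)K": r"PK.{10}K",
}


def find_pilin_motifs(seq: str):
    """Return (motif name, 1-based start, matched text) for every hit."""
    hits = []
    for name, pattern in PILIN_MOTIFS.items():
        # The lookahead lets matches that overlap one another be found.
        for m in re.finditer(f"(?=({pattern}))", seq.upper()):
            hits.append((name, m.start() + 1, m.group(1)))
    return hits


# Hypothetical toy sequence, not taken from any GBS protein.
toy_protein = "MAAYPKNGTDSAVELKGG"
pilin_hits = find_pilin_motifs(toy_protein)
for hit in pilin_hits:
    print(hit)
```

Because the shorter motifs are contained in the longer ones, a single stretch of sequence can satisfy several patterns at once, as the toy example shows.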

One or more of the surface proteins may also include an E box motif. The E box motif contains a conserved glutamic acid residue that is believed to be necessary for pilus formation. Some examples of E box motifs may include the amino acid sequences YxLxETxAPxG (SEQ ID NO: 163), YxxxExxAxxGY (SEQ ID NO: 164), YxLxExxxPxDY (SEQ ID NO: 165), or YxLxETxAPxGY (SEQ ID NO: 152).
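In each of the E box motifs above, the conserved glutamic acid is the fifth residue. A minimal sketch that locates such motifs and reports the position of that glutamic acid is given below; the function name and the toy sequence are hypothetical, and x is read as any residue.

```python
import re

# E box motifs from the text; "x" is written as "." (any residue).
E_BOX_MOTIFS = {
    "YxLxETxAPxG": r"Y.L.ET.AP.G",
    "YxxxExxAxxGY": r"Y...E..A..GY",
    "YxLxExxxPxDY": r"Y.L.E...P.DY",
    "YxLxETxAPxGY": r"Y.L.ET.AP.GY",
}
GLU_OFFSET = 4  # the conserved E is the fifth residue of every motif listed


def find_e_boxes(seq: str):
    """Return (motif name, 1-based start, 1-based position of the conserved E)."""
    hits = []
    for name, pattern in E_BOX_MOTIFS.items():
        for m in re.finditer(f"(?=({pattern}))", seq.upper()):
            start = m.start() + 1
            hits.append((name, start, start + GLU_OFFSET))
    return hits


# Hypothetical toy sequence, not taken from any GBS protein.
toy_e_box = "GSYTLTETQAPEGYKL"
e_box_hits = find_e_boxes(toy_e_box)
for hit in e_box_hits:
    print(hit)
```

Reporting the glutamic acid position makes it straightforward to pick the residue to mutate when testing whether it is needed for pilus formation.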

As shown in FIG. 3, GBS AI-2 may include the surface exposed proteins of GBS 67, GBS 59 and GBS 150 and the sortases of SAG1406 and SAG1405. Alternatively, GBS AI-2 may include the proteins 01521, 01524 and 01525 and sortases 01520 and 01522. GBS 067 and 01524 are preferred AI-2 surface proteins.

AI-2 may also include a divergently transcribed transcriptional regulator such as a RofA like protein (for example rogB). As in AI-1, rogB is thought to regulate the expression of the AI-2 operon.

A schematic of AI-2 within several GBS serotypes is depicted in FIG. 4. (Percentages shown are amino acid identity to the 2603 sequence). While the AI-2 surface proteins GBS 59 and GBS 67 are more variable across GBS serotypes than the corresponding AI-1 surface proteins, AI-2 surface protein GBS 67 appears to be conserved in GBS serotypes where the AI-1 surface proteins are disrupted or missing.

For example, as discussed above and in FIG. 2, the AI-1 GBS 80 surface protein is fragmented in GBS serotype II, strain isolate 18RS21. Within AI-2 for this same sequence, as shown in FIG. 4, the GBS 67 surface protein has 99% amino acid sequence homology with the corresponding sequence in strain isolate 2603. Similarly, the AI-1 GBS 80 surface protein appears to be missing in GBS serotype Ib, strain isolate H36B and GBS serotype Ia, strain isolate 515. Within AI-2 for these sequences, however, the GBS 67 surface protein has 97-99% amino acid sequence homology with the corresponding sequence in strain isolate 2603. GBS 67 appears to have two allelic variants, which can be divided according to percent homology with strains 2603 and H36B. See FIGS. 237-239.

Unlike for GBS 67, amino acid sequence identity of GBS 59 is variable across different GBS strains. As shown in FIGS. 63 and 224, GBS 59 of GBS strain isolate 2603 shares 100% amino acid residue homology with GBS strain 18RS21, 62% amino acid sequence homology with GBS strain H36B, 48% amino acid residue homology with GBS strain 515 and GBS strain CJB111, and 47% amino acid residue homology with GBS strain NEM316. The amino acid sequence homologies of the different GBS strains suggest that there are two isoforms of GBS 59. The first isoform appears to include the GBS 59 protein of GBS strains CJB111, NEM316, and 515. The second isoform appears to include the GBS 59 protein of GBS strains 18RS21, 2603, and H36B. (See FIGS. 63 and 224.) As expected from the variability in GBS 59 isoforms, antibodies specific for the first GBS 59 isoform detect the first but not the second GBS 59 isoform and antibodies specific for the second GBS 59 isoform detect the second but not the first GBS 59 isoform. See FIG. 226A, which shows FACS analysis for GBS 59 surface expression of 28 GBS strains in which a GBS 59 gene was detected using PCR. For each of the 28 GBS strains, FACS analysis was performed using either an antibody for GBS 59 isoform 1 (α-Cjb111) or GBS 59 isoform 2 (α-2603). Only one of the two antibodies detected GBS 59 surface expression on each GBS strain. As a negative control, GBS strains in which a GBS 59 gene was not detectable by PCR did not have significant GBS 59 surface expression levels. See FIG. 226B.
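The division of GBS 59 into two isoforms can be illustrated by grouping strains according to their percent identity to the strain 2603 protein. The sketch below uses the percentages quoted above; the 55% cut-off, the group labels and the function name are assumptions chosen only to separate the two clusters, and the 100% entry for 2603 itself is the trivial self-comparison.

```python
# Percent identity of GBS 59 to the strain 2603 protein, as quoted in the text.
IDENTITY_TO_2603 = {
    "2603": 100,    # self-comparison
    "18RS21": 100,
    "H36B": 62,
    "515": 48,
    "CJB111": 48,
    "NEM316": 47,
}

THRESHOLD = 55  # hypothetical cut-off lying between the two clusters


def group_isoforms(identity: dict, threshold: float) -> dict:
    """Split strains into a 2603-like group and a second group by percent identity."""
    groups = {"2603-like": [], "other": []}
    for strain, pct in sorted(identity.items()):
        key = "2603-like" if pct >= threshold else "other"
        groups[key].append(strain)
    return groups


isoform_groups = group_isoforms(IDENTITY_TO_2603, THRESHOLD)
print(isoform_groups)
```

With these values the 2603-like group corresponds to the second isoform (18RS21, 2603, H36B) and the other group to the first isoform (515, CJB111, NEM316), matching the assignment given in the text.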

Also, GBS 59 is opsonic only against GBS strains expressing a homologous GBS 59 protein. See FIG. 225.

In one embodiment, the immunogenic composition of the invention comprises a first and a second isoform of the GBS 59 protein to provide protection across a wide range of GBS serotypes that express polypeptides from a GBS AI-2. The first isoform may be the GBS 59 protein of GBS strain CJB111, NEM316, or 515. The second isoform may be the GBS 59 protein of GBS strain 18RS21, 2603, or H36B.

The gene encoding GBS 59 has been identified in a high number of GBS isolates; the GBS 59 gene was detected in 31 of 40 GBS isolates tested (77.5%). The GBS 59 protein also appears to be present as part of a pilus in whole extracts derived from GBS strains. FIG. 64 shows detection of high molecular weight GBS 59 polymers in whole extracts of GBS strains CJB111, 7357B, COH31, D1363C, 5408, 1999, 5364, 5518, and 515 using antiserum raised against GBS 59 of GBS strain CJB111. FIG. 65 also shows detection of these high molecular weight GBS 59 polymers in whole extracts of GBS strains D136C, 515, and CJB111 with anti-GBS 59 antiserum. (See also FIG. 220 A for detection of GBS 59 high molecular weight polymers in strain 515.) FIG. 65 confirms the presence of different isoforms of GBS 59. Antisera raised against two different GBS 59 isoforms result in different patterns of immunoreactivity depending on the GBS strain origin of the whole extract. FIG. 65 further shows detection of GBS 59 monomers in purified GBS 59 preparations.

GBS 59 is also highly expressed on the surface of GBS strains. GBS 59 was detected on the surface of GBS strains CJB111, DK1, DK8, Davis, 515, 2986, 5551, 1169, and 7357B by FACS analysis using mouse antiserum raised against GBS 59 of GBS CJB111. FACS analysis did not detect surface expression of GBS 59 in GBS strains SMU071, JM9130013, and COH1, which do not contain a GBS 59 gene. (See FIG. 66.) Further confirmation that GBS 59 is expressed on the surface of GBS is detection of GBS 59 by immuno-electron microscopy on the surface of GBS strain 515 bacteria. See FIG. 215.

GBS 67 and GBS 150 also appear to be included in high molecular weight structures, or pili. FIG. 69 shows that anti-GBS 67 and anti-GBS 150 immunoreact with high molecular weight structures in whole GBS strain 515 extracts. (See also FIG. 220 B and C.) It is also notable in FIG. 69 that the anti-GBS 59 antisera, raised in a mouse following immunization with GBS 59 of GBS strain 2603, do not cross-react with GBS 59 in GBS strain 515. GBS 59 of GBS strain 515 is a different isoform than GBS 59 of GBS strain 2603. See FIG. 63, which illustrates that the homology of these two GBS 59 polypeptides is 48%, and FIG. 65, which confirms that GBS 59 antisera raised against GBS strain 2603 do not cross-react with GBS 59 of GBS strain 515.

Formation of pili containing GBS 150 does not appear to require GBS 67 expression. FIG. 70 provides Western blots showing that higher molecular weight structures in GBS strain 515 total extracts immunoreact with anti-GBS 67 and anti-GBS 150 antiserum. In a GBS strain 515 lacking GBS 67 expression, anti-GBS 67 antiserum no longer immunoreacts with polypeptides in total extracts, while anti-GBS 150 antiserum is still able to react with high molecular weight structures.

Likewise, formation of pili containing GBS 59 does not appear to require GBS 67 expression. As expected, FACS detects GBS 67 cell surface expression on wildtype GBS strain 515, but not on GBS strain 515 cells knocked out for GBS 67. FACS analysis using anti-GBS 59 antisera, however, detects GBS 59 expression on both the wildtype GBS strain 515 cells and the GBS strain 515 cells knocked out for GBS 67. Thus, GBS 59 cell surface expression is detected on GBS strain 515 cells regardless of GBS 67 expression.

GBS 67, while present in pili, appears to be localized around the surface of GBS strain 515 cells. See the immuno-electron micrographs presented in FIG. 216. GBS 67 binds to fibronectin. See FIG. 217.

Formation of pili encoded by GBS AI-2 does require expression of GBS 59. Deletion of GBS 59 from strain 515 bacteria eliminates detection of high molecular weight structures by antibodies that bind to GBS 59 (FIG. 221 A, lane 3), GBS 67 (FIG. 221 B, lane 3), and GBS 150 (FIG. 221 C, lane 3). By contrast, Western blot analysis of 515 bacteria with a deletion of the GBS 67 gene detects high molecular weight structures using GBS 59 (FIG. 221 A, lane 2) and GBS 150 (FIG. 221 C, lane 2) antisera. Similarly, Western blot analysis of 515 bacteria with a deletion of the GBS 150 gene detects high molecular weight structures using GBS 59 (FIG. 221 A, lane 4) and GBS 67 (FIG. 221 B, lane 4) antisera. See also FIG. 223, which provides Western blots of each of the 515 strains interrogated with antibodies for GBS 59, GBS 67, and GBS 150. FACS analysis of strain 515 bacteria deleted for either GBS 59 or GBS 67 confirms these results. See FIG. 222, which shows that only deletion of GBS 59 abolishes surface expression of both GBS 59 and GBS 67.

Formation of pili encoded by GBS AI-2 also requires expression of at least one of the two GBS adhesin island-2 encoded sortases. See FIG. 218, which provides Western blot analysis of strain 515 bacteria lacking Srt1, Srt2, or both Srt1 and Srt2. Only deletion of both Srt1 and Srt2 abolishes pilus assembly as detected by antibodies that react with each of GBS 59, GBS 67 and GBS 150. The results of the Western blot analysis were verified by FACS, which provided similar results. See FIG. 219.

As shown in FIG. 4, two of the GBS strain isolates (COH1 and A909) do not appear to contain homologues to the surface proteins GBS 59 and GBS 67. (For these two strains, the percentages shown in FIG. 4 are amino acid identity to the COH1 protein). Notwithstanding the difference in the surface protein lengths for these two strains, AI-2 within these sequences still contains two sortase proteins and three LPXTG containing surface proteins, as well as a signal peptidase sequence leading into the first surface protein. One of the surface proteins in this variant of AI-2, spb1, has previously been identified as a potential adhesion protein. (See Adderson et al., Infection and Immunity (2003) 71(12):6857-6863). Alternatively, because of the lack of GBS 59 and GBS 67 sequences, this variant of AI-2 may be a third type of AI (Adhesin Island-3, AI-3, or GBS AI-3).

More than one AI surface protein may be present in the oligomeric, pilus-like structures of the invention. For example, GBS 59 and GBS 67 may be incorporated into an oligomeric structure. Alternatively, GBS 59 and GBS 150 may be incorporated into an oligomeric structure, or GBS 59, GBS 150 and GBS 67 may be incorporated into an oligomeric structure.

In another embodiment, the invention includes compositions comprising two or more AI surface proteins. The composition may include surface proteins from the same adhesin island. For example, the composition may include two or more GBS AI-2 surface proteins, such as GBS 59, GBS 67 and GBS 150. The surface proteins may be isolated from Gram positive bacteria or they may be produced recombinantly.

GAS Adhesin Islands

As discussed above, Applicants have identified at least four different GAS Adhesin Islands. These adhesin islands are thought to encode surface proteins which are important in the bacteria's virulence, and Applicants have obtained the first electron micrographs revealing the presence of these adhesin island proteins in hyperoligomeric pilus structures on the surface of Group A Streptococcus.

Group A Streptococcus is a human specific pathogen which causes a wide variety of diseases ranging from pharyngitis and impetigo through life threatening invasive disease and necrotizing fasciitis. In addition, post-streptococcal autoimmune responses are still a major cause of cardiac pathology in children.

Group A Streptococcal infection of its human host can generally occur in three phases. The first phase involves attachment and/or invasion of the bacteria into host tissue and multiplication of the bacteria within the extracellular spaces. Generally this attachment phase begins in the throat or the skin. The deeper the tissue level infected, the more severe the damage that can be caused. In the second phase of infection, the bacteria secrete a soluble toxin that diffuses into the surrounding tissue or even systemically through the vasculature. This toxin binds to susceptible host cell receptors and triggers inappropriate immune responses by these host cells, resulting in pathology. Because the toxin can diffuse throughout the host, the necrosis directly caused by the GAS toxins may be physically located in sites distant from the bacterial infection. The final phase of GAS infection can occur long after the original bacteria have been cleared from the host system. At this stage, the host's previous immune response to the GAS bacteria can damage host tissues, such as the heart, due to cross reactivity between epitopes of a GAS surface protein, M, and those tissues. A general review of GAS infection can be found in Principles of Bacterial Pathogenesis, Groisman ed., Chapter 15 (2001).

In order to prevent the pathogenic effects associated with the later stages of GAS infection, an effective vaccine against GAS will preferably facilitate host elimination of the bacteria during the initial attachment and invasion stage.

Isolates of Group A Streptococcus are historically classified according to the M surface protein described above. The M protein is a surface exposed, trypsin-sensitive protein generally comprising two polypeptide chains complexed in an alpha helical formation. The carboxyl terminus is anchored in the cytoplasmic membrane and is highly conserved among all group A streptococci. The amino terminus, which extends through the cell wall to the cell surface, is responsible for the antigenic variability observed among the 80 or more serotypes of M proteins.

A second layer of classification is based on a variable, trypsin-resistant surface antigen, commonly referred to as the T-antigen. Decades of epidemiology based on M and T serological typing have been central to studies on the biological diversity and disease causing potential of Group A Streptococci. While the M-protein component and its inherent variability have been extensively characterized, even after five decades of study, there is still very little known about the structure and variability of T-antigens. Antisera to define T types are commercially available from several sources, including Sevapharma (http://www.sevapharma.cz/en).

The gene coding for one form of T-antigen, T-type 6, from an M6 strain of GAS (D741) has been cloned and characterized and maps to an approximately 11 kb highly variable pathogenicity island. Schneewind et al., J. Bacteriol. (1990) 172(6):3310-3317. This island is known as the Fibronectin-binding, Collagen-binding T-antigen (FCT) region because it contains, in addition to the T6 coding gene (tee6), members of a family of genes coding for Extra Cellular Matrix (ECM) binding proteins. Bessen et al., Infection & Immunity (2002) 70(3):1159-1167. Several of the protein products of this gene family have been shown to directly bind fibronectin and/or collagen. See Hanski et al., Infection & Immunity (1992) 60(12):5119-5125; Talay et al., Infection & Immunity (1992) 60(9):3837-3844; Jaffe et al. (1996) 21(2):373-384; Rocha et al., Adv Exp Med Biol. (1997) 418:737-739; Kreikemeyer et al., J Biol Chem (2004) 279(16):15850-15859; Podbielski et al., Mol. Microbiol. (1999) 31(4):1051-64; and Kreikemeyer et al., Int. J. Med Microbiol (2004) 294(2-3):177-88. In some cases direct evidence for a role of these proteins in adhesion and invasion has been obtained.

Applicants raised antiserum against a recombinant product of the tee6 gene and used it to explore the expression of T6 in M6 strain 2724. In immunoblots of mutanolysin extracts of this strain, the antiserum recognized, in addition to a band corresponding to the predicted molecular mass of the product, very high molecular weight ladders ranging in mobility from about 100 kDa to beyond the resolution of the 3-8% gradient gels used.

This pattern of high molecular weight products is similar to that observed in immunoblots of the protein components of the pili identified in Streptococcus agalactiae (described above) and previously in Corynebacterium diphtheriae. Electron microscopy of strain M6_2724 with antisera specific for the product of tee6 revealed abundant surface staining and long pilus-like structures extending up to 700 nanometers from the bacterial surface, revealing that the T6 protein, one of the antigens recognized in the original Lancefield serotyping system, is located within a GAS Adhesin Island (GAS AI-1) and forms long covalently linked pilus structures.

Applicants have identified at least four different Group A Streptococcus Adhesin Islands. While these GAS AI sequences can be identified in numerous M types, Applicants have surprisingly discovered a correlation between the four main pilus subunits from the four different GAS AI types and specific T classifications. While other trypsin-resistant surface exposed proteins are likely also implicated in the T classification designations, the discovery of the role of the GAS adhesin islands (and the associated hyper-oligomeric pilus-like structures) in T classification and GAS serotype variance has important implications for prevention and treatment of GAS infections. Applicants have identified protein components within each of the GAS adhesin islands which are associated with the pilus formation. These proteins are believed to be involved in the bacteria's initial adherence mechanisms. Immunological recognition of these proteins may allow the host immune response to slow or prevent the bacteria's transition into the more pathogenic later stages of infection.

In addition, Applicants have discovered that the GBS pili structures appear to be implicated in the formation of biofilms (populations of bacteria growing on a surface, often enclosed in an exopolysaccharide matrix). Biofilms are generally associated with bacterial resistance, as antibiotic treatments and host immune response are frequently unable to eradicate all of the bacterial components of the biofilm. Direction of a host immune response against surface proteins exposed during the first steps of bacterial attachment (i.e., before complete biofilm formation) is preferable.

The invention therefore provides for improved immunogenic compositions against GAS infection which may target GAS bacteria during their initial attachment efforts to the host epithelial cells and may provide protection against a wide range of GAS serotypes. The immunogenic compositions of the invention include GAS AI surface proteins which may be formulated in an oligomeric, or hyperoligomeric (pilus) form. The invention also includes combinations of GAS AI surface proteins. Combinations of GAS AI surface proteins may be selected from the same adhesin island or they may be selected from different GAS adhesin islands.

While there is surprising variability in the number and sequence of the GAS AI components across isolates, GAS AI sequences may be generally characterized as Type 1, Type 2, Type 3, and Type 4, depending on the number and type of sortase sequence within the island and the percentage identity of other proteins within the island. Schematics of the GAS adhesin islands are set forth in FIG. 51A and FIG. 162. In all strains identified so far, the adhesin island region is flanked by highly conserved open reading frames M1_123 and M1_136. Between three and five genes in each GAS adhesin island code for ECM binding adhesin proteins containing LPXTG motifs.

GAS Adhesin Island 1

As discussed above, Applicants have identified an adhesin island, “GAS Adhesin Island 1” or “GAS AI-1”, within the genome of Group A Streptococcus serotypes and isolates. GAS AI-1 comprises a series of approximately five open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-1 proteins”). GAS AI-1 preferably comprises surface proteins, a srtB sortase, and a rofA divergently transcribed transcriptional regulator. GAS AI-1 surface proteins may include a fibronectin binding protein, a collagen adhesion protein and a fimbrial structural subunit. Preferably, each of these GAS AI-1 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122) or LPXSG (SEQ ID NO: 134) (conservative replacement of threonine with serine). Specifically, GAS AI-1 includes open reading frames encoding for two or more (i.e., 2, 3, 4 or 5) of M6_Spy0157, M6_Spy0158, M6_Spy0159, M6_Spy0160, and M6_Spy0161.

Applicants have also identified open reading frames encoding fimbrial structural subunits in other GAS bacteria harbouring an AI-1. These open reading frames encode fimbrial structural subunits CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial. A GAS AI-1 may comprise a polynucleotide encoding any one of CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial.

As discussed above, the hyper-oligomeric pilus structure of GAS AI-1 appears to be responsible for the T-antigen type 6 classification, and GAS AI-1 corresponds to the FCT region previously identified for tee6. As in GAS AI-1, the tee6 FCT region includes open reading frames encoding for a collagen adhesion protein (cpa, capsular polysaccharide adhesion) and a fibronectin binding protein (prtF1). Immunoblots of tee6, a GAS AI-1 fimbrial structural subunit corresponding to M6_Spy0160, reveal high molecular weight structures indicative of the hyper-oligomeric pilus structures. Immunoblots with antiserum specific for Cpa also recognize a high molecular weight ladder structure, indicating Cpa involvement in the GAS AI-1 pilus structure or formation. In EM photos of GAS bacteria, Cpa antiserum reveals abundant staining on the surface of the bacteria and occasional gold particles extended from the surface of the bacteria. In contrast, immunoblots with antiserum specific for PrtF1 recognize only a single molecular species with electrophoretic mobility corresponding to its predicted molecular mass, indicating that PrtF1 may not be associated with the oligomeric pilus structure. A preferred immunogenic composition of the invention comprises a GAS AI-1 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Another preferred immunogenic composition of the invention comprises a GAS AI-1 surface protein which has been isolated in an oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising the GAS AI-1 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the GAS AI-1 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-1 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the GAS AI-1 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The LPXTG sortase substrate motif of a GAS AI surface protein may be generally represented by the formula XXXXG, wherein X at amino acid position 1 is an L, a V, an E, or a Q, wherein X at amino acid position 2 is a P if X at amino acid position 1 is an L, wherein X at amino acid position 2 is a V if X at amino acid position 1 is an E or a Q, wherein X at amino acid position 2 is a V or a P if X at amino acid position 1 is a V, wherein X at amino acid position 3 is any amino acid residue, wherein X at amino acid position 4 is a T if X at amino acid position 1 is a V, E, or Q, and wherein X at amino acid position 4 is a T, S, or A if X at amino acid position 1 is an L. Some examples of LPXTG motifs present in GAS AI surface proteins include LPXSG (SEQ ID NO: 134), VVXTG (SEQ ID NO: 135), EVXTG (SEQ ID NO: 136), VPXTG (SEQ ID NO: 137), QVXTG (SEQ ID NO: 138), LPXAG (SEQ ID NO: 139), QVPTG (SEQ ID NO: 140), and FPXTG (SEQ ID NO: 141).
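The position-dependent rule above can be written as a short pattern. The following Python sketch is illustrative only and is not part of the disclosure: the names AMINO_ACIDS, MOTIF_RULE, matches_rule and find_motifs are hypothetical, the 20 standard one-letter amino acid codes are assumed for the "any amino acid" position, and the pattern transcribes only the XXXXG formula as stated. Note that FPXTG, listed among the examples, begins with F and therefore falls outside the stated formula; the sketch reports it as non-matching.

```python
import re

# Assumption: position 3 ("any amino acid residue") is limited to the
# 20 standard one-letter amino acid codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Transcription of the XXXXG formula:
#   position 1 = L      -> position 2 = P,      position 4 = T, S or A
#   position 1 = E or Q -> position 2 = V,      position 4 = T
#   position 1 = V      -> position 2 = V or P, position 4 = T
#   position 3 = any residue, position 5 = G
MOTIF_RULE = re.compile(
    rf"(?:LP[{AMINO_ACIDS}][TSA]|[EQ]V[{AMINO_ACIDS}]T|V[VP][{AMINO_ACIDS}]T)G"
)


def matches_rule(pentapeptide: str) -> bool:
    """Return True if a five-residue string satisfies the XXXXG formula."""
    return MOTIF_RULE.fullmatch(pentapeptide.upper()) is not None


def find_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (zero-based start index, motif) for each match in a sequence."""
    return [(m.start(), m.group()) for m in MOTIF_RULE.finditer(sequence.upper())]


if __name__ == "__main__":
    for candidate in ("LPSTG", "LPKSG", "LPKAG", "EVPTG", "VPKTG", "FPKTG"):
        print(candidate, matches_rule(candidate))
    # Hypothetical toy sequence, not taken from any SEQ ID NO.
    print(find_motifs("MKLPSTGEEQVPTGAA"))
```

A single alternation keeps each position-1 case next to the constraints it imposes on positions 2 and 4, which makes the pattern easy to check against the wording of the formula.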

The GAS AI surface proteins of the invention may affect the ability of the GAS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GAS to translocate through an epithelial cell layer. Preferably, one or more GAS AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. GAS AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The GAS AI-1 sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-1 may encode for at least one surface protein. Alternatively, GAS AI-1 may encode for at least two surface exposed proteins and at least one sortase. Preferably, GAS AI-1 encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.
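As a rough illustration of the cleavage step described above, the following Python sketch splits a protein sequence between the threonine and glycine residues of an LPXTG motif. It is a toy model under stated assumptions and not the Applicants' method: the function name cleave_at_lpxtg is hypothetical, only the canonical LPXTG form is searched, and the last occurrence in the sequence is used on the assumption that the sorting motif lies near the carboxyl terminus.

```python
import re
from typing import Optional, Tuple

# Canonical LPXTG motif; X is assumed to be any of the 20 standard residues.
LPXTG = re.compile(r"LP[ACDEFGHIKLMNPQRSTVWY]TG")


def cleave_at_lpxtg(sequence: str) -> Optional[Tuple[str, str]]:
    """Split a sequence between the T and G of its last LPXTG motif.

    Returns (anchored_part, released_part), where anchored_part ends in the
    threonine whose carboxyl group would be linked to the cell wall
    precursor, or None if no LPXTG motif is present.
    """
    matches = list(LPXTG.finditer(sequence.upper()))
    if not matches:
        return None
    cut = matches[-1].end() - 1  # index of the glycine residue
    return sequence[:cut], sequence[cut:]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any SEQ ID NO.
    print(cleave_at_lpxtg("MKKAALPETGEELLK"))
```

The first element of the returned pair corresponds to the portion that remains attached through the threonine; the second is the carboxyl-terminal fragment removed by the cleavage.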

GAS AI-1 preferably includes a srtB sortase. GAS srtB sortases may preferably anchor surface proteins with an LPSTG motif (SEQ ID NO: 166), particularly where the motif is followed by a serine.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a GAS AI-1 surface protein such as M6_Spy0157, M6_Spy0159, M6_Spy0160, CDC SS 410_fimbrial, ISS3650_fimbrial, or DSM2071_fimbrial. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably via the threonine or serine amino acid residue.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a GAS Adhesin Island protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more GAS Adhesin Island 1 (“GAS AI-1”) proteins and one or more GAS Adhesin Island 2 (“GAS AI-2”), GAS Adhesin Island 3 (“GAS AI-3”), or GAS Adhesin Island 4 (“GAS AI-4”) proteins, wherein one or more of the GAS Adhesin Island proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the GAS AI-1 proteins, GAS AI-1 may also include a divergently transcribed transcriptional regulator such as RofA (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction).

GAS Adhesin Island 2

A second adhesin island, “GAS Adhesin Island 2” or “GAS AI-2”, has also been identified in Group A Streptococcus serotypes and isolates. GAS AI-2 comprises a series of approximately eight open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-2 proteins”). Specifically, GAS AI-2 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, 7, or 8) of GAS15, Spy0127, GAS16, GAS17, GAS18, Spy0131, Spy0133, and GAS20.

A preferred immunogenic composition of the invention comprises a GAS AI-2 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Another preferred immunogenic composition of the invention comprises a GAS AI-2 surface protein which has been isolated in an oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising the GAS AI-2 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the GAS AI-2 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-2 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the GAS AI-2 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The AI surface proteins of the invention may affect the ability of the GAS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GAS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The GAS AI-2 sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-2 may encode for at least one surface protein. Alternatively, GAS AI-2 may encode for at least two surface exposed proteins and at least one sortase. Preferably, GAS AI-2 encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as GAS15, GAS16, or GAS18. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably via the threonine amino acid residue.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a GAS Adhesin Island protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more GAS Adhesin Island 2 (“GAS AI-2”) proteins and one or more GAS Adhesin Island 1 (“GAS AI-1”), GAS Adhesin Island 3 (“GAS AI-3”), or GAS Adhesin Island 4 (“GAS AI-4”) proteins, wherein one or more of the Adhesin Island proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the GAS AI-2 proteins, GAS AI-2 may also include a divergently transcribed transcriptional regulator such as rofA (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction).

GAS Adhesin Island 3

A third adhesin island, “GAS Adhesin Island 3” or “GAS AI-3”, has also been identified in several Group A Streptococcus serotypes and isolates. GAS AI-3 comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-3 proteins”). Specifically, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, SpyM3_0104, SPs0100, SPs0101, SPs0102, SPs0103, SPs0104, SPs0105, SPs0106, orf78, orf79, orf80, orf81, orf82, orf83, orf84, spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, and SpyoM01000149. In one embodiment, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, and SpyM3_0104. In another embodiment, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of SPs0100, SPs0101, SPs0102, SPs0103, SPs0104, SPs0105, and SPs0106. In a further embodiment, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of orf78, orf79, orf80, orf81, orf82, orf83, and orf84. In yet another embodiment, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, and spyM18_0132. In yet another embodiment, GAS AI-3 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, and SpyoM01000149.

Applicants have also identified open reading frames encoding fimbrial structural subunits in other GAS bacteria harbouring an AI-3. These open reading frames encode fimbrial structural subunits ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. A GAS AI-3 may comprise a polynucleotide encoding any one of ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial.

One or more of the GAS AI-3 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-3 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

A preferred immunogenic composition of the invention comprises a GAS AI-3 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Another preferred immunogenic composition of the invention comprises a GAS AI-3 surface protein which has been isolated in an oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising the GAS AI-3 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the GAS AI-3 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The AI surface proteins of the invention may affect the ability of the GAS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GAS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The GAS AI-3 sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-3 may encode for at least one surface protein. Alternatively, GAS AI-3 may encode for at least two surface exposed proteins and at least one sortase. Preferably, GAS AI-3 encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine or alanine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

The invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as SpyM3_0098, SpyM3_0100, SpyM3_0102, SpyM3_0104, SPs0100, SPs0102, SPs0104, SPs0106, orf78, orf80, orf82, orf84, spyM18_0126, spyM18_0128, spyM18_0130, spyM18_0132, SpyoM01000155, SpyoM01000153, SpyoM01000151, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as SpyM3_0098, SpyM3_0100, SpyM3_0102, and SpyM3_0104. In another embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as SPs0100, SPs0102, SPs0104, and SPs0106. In another embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as orf78, orf80, orf82, and orf84. In yet another embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as spyM18_0126, spyM18_0128, spyM18_0130, and spyM18_0132. In a further embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as SpyoM01000155, SpyoM01000153, SpyoM01000151, and SpyoM01000149. In yet a further embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably via the threonine amino acid residue.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a GAS Adhesin Island protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more GAS Adhesin Island 3 (“GAS AI-3”) proteins and one or more GAS Adhesin Island 1 (“GAS AI-1”), GAS Adhesin Island 2 (“GAS AI-2”), or GAS Adhesin Island 4 (“GAS AI-4”) proteins, wherein one or more of the Adhesin Island proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the GAS AI-3 proteins, GAS AI-3 may also include a transcriptional regulator such as Nra.

GAS Adhesin Island 4

A fourth adhesin island, “GAS Adhesin Island 4” or “GAS AI-4”, has also been identified in Group A Streptococcus serotypes and isolates. GAS AI-4 comprises a series of approximately eight open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases (“GAS AI-4 proteins”). Specifically, GAS AI-4 includes open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, 7, or 8) of 19224134, 19224135, 19224136, 19224137, 19224138, 19224139, 19224140, and 19224141.

Applicants have also identified open reading frames encoding fimbrial structural subunits in other GAS bacteria harbouring an AI-4. These open reading frames encode fimbrial structural subunits 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial. A GAS AI-4 may comprise a polynucleotide encoding any one of 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial.

One or more of the GAS AI-4 open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the GAS AI-4 open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

A preferred immunogenic composition of the invention comprises a GAS AI-4 surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Another preferred immunogenic composition of the invention comprises a GAS AI-4 surface protein which has been isolated in an oligomeric (pilus) form. The oligomer or hyperoligomeric pilus structures comprising the GAS AI-4 surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the GAS AI-4 surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. The AI surface proteins of the invention may affect the ability of the GAS bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of GAS to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The GAS AI-4 sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. GAS AI-4 may encode for at least one surface protein. Alternatively, GAS AI-4 may encode for at least two surface exposed proteins and at least one sortase. Preferably, GAS AI-4 encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising an AI surface protein such as 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably via the threonine amino acid residue.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a GAS Adhesin Island protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more GAS Adhesin Island 4 (“GAS AI-4”) proteins and one or more GAS Adhesin Island 1 (“GAS AI-1”), GAS Adhesin Island 2 (“GAS AI-2”), or GAS Adhesin Island 3 (“GAS AI-3”) proteins, wherein one or more of the Adhesin Island proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the GAS AI-4 proteins, GAS AI-4 may also include a divergently transcribed transcriptional regulator such as rofA (i.e., the transcriptional regulator is located near or adjacent to the AI protein open reading frames, but is transcribed in the opposite direction).

The oligomeric, pilus-like structures of the invention may be combined with one or more additional GAS proteins. In one embodiment, the oligomeric, pilus-like structures comprise one or more AI surface proteins in combination with a second GAS protein.

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures in which the bacteria express an AI surface protein. The invention therefore includes a method for manufacturing an oligomeric AI surface antigen comprising culturing a GAS bacterium that expresses the oligomeric AI protein and isolating the expressed oligomeric AI protein from the GAS bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed AI protein. Preferably, the AI protein is in a hyperoligomeric form.

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures overexpressing an AI surface protein. The invention therefore includes a method for manufacturing an oligomeric Adhesin Island surface antigen comprising culturing a GAS bacterium adapted for increased AI protein expression and isolating the expressed oligomeric Adhesin Island protein from the GAS bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed Adhesin Island protein. Preferably, the Adhesin Island protein is in a hyperoligomeric form.

The GAS bacteria are preferably adapted to increase AI protein expression by at least two (e.g., 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150 or 200) times wild type expression levels.

GAS bacteria may be adapted to increase AI protein expression by any means known in the art, including methods of increasing gene dosage and methods of gene upregulation. Such means include, for example, transformation of the GAS bacteria with a plasmid encoding the AI protein. The plasmid may include a strong promoter or it may include multiple copies of the sequence encoding the AI protein. Optionally, the sequence encoding the AI protein within the GAS bacterial genome may be deleted. Alternatively, or in addition, the promoter regulating the GAS Adhesin Island may be modified to increase expression.

The invention further includes GAS bacteria which have been adapted to produce increased levels of AI surface protein. In particular, the invention includes GAS bacteria which have been adapted to produce oligomeric or hyperoligomeric AI surface protein. In one embodiment, the Gram positive bacteria of the invention are inactivated or attenuated to permit in vivo delivery of the whole bacteria, with the AI surface protein exposed on their surface.

The invention further includes GAS bacteria which have been adapted to have increased levels of expressed AI protein incorporated in pili on their surface. The GAS bacteria may be adapted to have increased exposure of oligomeric or hyperoligomeric AI proteins on their surface by increasing expression levels of LepA polypeptide, or an equivalent signal peptidase, in the GAS bacteria. Applicants have shown that deletion of LepA in strain SF370 bacteria, which harbour a GAS AI-2, abolishes surface exposure of M and pili proteins on the GAS. Increased levels of LepA expression in GAS are expected to result in increased exposure of M and pili proteins on the surface of GAS. Increased expression of LepA in GAS may be achieved by any means known in the art, such as increasing gene dosage and methods of gene upregulation. The GAS bacteria adapted to have increased levels of LepA expression may additionally be adapted to express increased levels of at least one pili protein.

Alternatively, the AI proteins of the invention may be expressed on the surface of a non-pathogenic Gram positive bacterium, such as Streptococcus gordonii (See, e.g., Byrd et al., “Biological consequences of antigen and cytokine co-expression by recombinant Streptococcus gordonii vaccine vectors”, Vaccine (2002) 20:2197-2205) or Lactococcus lactis (See, e.g., Mannam et al., “Mucosal Vaccine Made from Live, Recombinant Lactococcus lactis Protects Mice against Pharyngeal Infection with Streptococcus pyogenes” Infection and Immunity (2004) 72(6):3444-3450). As used herein, non-pathogenic Gram positive bacteria refer to Gram positive bacteria which are compatible with a human host subject and are not associated with human pathogenesis. Preferably, the non-pathogenic bacteria are modified to express the AI surface protein in oligomeric, or hyper-oligomeric form. Sequences encoding for an AI surface protein and, optionally, an AI sortase, may be integrated into the non-pathogenic Gram positive bacterial genome or inserted into a plasmid. The non-pathogenic Gram positive bacteria may be inactivated or attenuated to facilitate in vivo delivery of the whole bacteria, with the AI surface protein exposed on their surface. Alternatively, the AI surface protein may be isolated or purified from a bacterial culture of the non-pathogenic Gram positive bacteria. For example, the AI surface protein may be isolated from cell extracts or culture supernatants. Alternatively, the AI surface protein may be isolated or purified from the surface of the non-pathogenic Gram positive bacteria.

The non-pathogenic Gram positive bacteria may be used to express any of the GAS Adhesin Island proteins described herein. The non-pathogenic Gram positive bacteria are transformed to express an Adhesin Island surface protein. Preferably, the non-pathogenic Gram positive bacteria also express at least one Adhesin Island sortase. The AI transformed non-pathogenic Gram positive bacteria of the invention may be used to prevent or treat infection with pathogenic GAS.

Applicants modified L. lactis to demonstrate that it can express GAS AI polypeptides, as it can GBS polypeptides. L. lactis was transformed with pAM401 constructs encoding entire pili gene clusters of AI-1, AI-2, and AI-4 adhesin islands. Briefly, pAM401 is a promoterless high-copy plasmid. The entire pili gene clusters of an M6 (AI-1), M1 (AI-2), and M12 (AI-4) bacteria were inserted into the pAM401 construct. The gene clusters were transcribed under the control of their own (M6, M1, or M12) promoter or the GBS promoter that successfully initiated expression of the GBS AI-1 adhesin islands in L. lactis, described above. FIG. 172 provides a schematic depiction of GAS M6 (AI-1), M1 (AI-2), and M12 (AI-4) adhesin islands and indicates the portions of the adhesin island sequences inserted in the pAM401 construct.

Each of the L. lactis transformed with one of the M6, M1, or M12 adhesin island gene clusters expressed high molecular weight structures that were immunoreactive with antibodies that bind to polypeptides present in their respective pili. FIGS. 173 A-C provide results of Western blot analysis of surface protein-enriched extracts of L. lactis transformed with M6 (FIG. 173 A), M1 (FIG. 173 B), or M12 (FIG. 173 C) adhesin island gene clusters using antibodies that bind to the fimbrial structural subunit encoded by each cluster. FIG. 173A at lanes 3 and 4 shows detection of high molecular weight structures in L. lactis transformed with an adhesin island pilus gene cluster from an M1 AI-2 using an antibody that binds to fimbrial structural subunit Spy0128. FIG. 173B at lanes 3 and 4 shows detection of high molecular weight structures in L. lactis transformed with an adhesin island pilus gene cluster from an M12 AI-4 using an antibody that binds to fimbrial structural subunit EftLSL.A. FIG. 173C at lane 3 shows detection of high molecular weight structures in L. lactis transformed with an adhesin island pilus gene cluster from an M6 AI-1 using an antibody that binds to fimbrial structural subunit M6_Spy0160. In FIGS. 173 A-C, “p1” immediately following the notation of AI subtype indicates that the promoter present in the Adhesin Island is used to drive transcription of the adhesin island gene cluster and “p2” indicates that the promoter was the GBS promoter described above. Thus, it appears that L. lactis is capable of expressing the fimbrial structural subunits encoded by GAS adhesin islands in an oligomeric form.

Alternatively, the oligomeric, pilus-like structures may be produced recombinantly. If produced in a recombinant host cell system, the AI surface protein will preferably be expressed in coordination with the expression of one or more of the AI sortases of the invention. Such AI sortases will facilitate oligomeric or hyperoligomeric formation of the AI surface protein subunits.

S. pneumoniae from TIGR4 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae from TIGR4. The S. pneumoniae from TIGR4 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae from TIGR4 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of SP0462, SP0463, SP0464, SP0465, SP0466, SP0467, and SP0468.
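The phrase "two or more (i.e., 2, 3, 4, 5, 6, or 7)" of the seven listed open reading frames can be made concrete by enumerating the possible selections. The following is a minimal illustrative sketch in Python; the function name `orf_subsets` is hypothetical and is not part of the source, and only the ORF identifiers are taken from the text above.

```python
from itertools import combinations

# ORF identifiers as listed in the text for the S. pneumoniae TIGR4 adhesin island.
TIGR4_AI_ORFS = ["SP0462", "SP0463", "SP0464", "SP0465",
                 "SP0466", "SP0467", "SP0468"]

def orf_subsets(orfs, minimum=2):
    """Yield every selection of at least `minimum` ORFs (hypothetical helper)."""
    for size in range(minimum, len(orfs) + 1):
        yield from combinations(orfs, size)

subsets = list(orf_subsets(TIGR4_AI_ORFS))
# Number of distinct selections of two or more of the seven ORFs.
print(len(subsets))
```

Under these assumptions the count follows from subtracting the empty selection and the seven single-ORF selections from all 2^7 possible selections.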

A preferred immunogenic composition of the invention comprises a S. pneumoniae from TIGR4 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. In a preferred embodiment, the oligomeric form is a hyperoligomer. Another preferred immunogenic composition of the invention comprises a S. pneumoniae from TIGR4 AI surface protein which has been isolated in an oligomeric (pilus) form. The oligomer or hyperoligomer pilus structures comprising S. pneumoniae surface proteins may be purified or otherwise formulated for use in immunogenic compositions.

One or more of the S. pneumoniae from TIGR4 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae from TIGR4 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae from TIGR4 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.
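As an illustration of how an LPXTG motif might be located in an amino acid sequence, the following minimal Python sketch scans a one-letter-code sequence with a regular expression in which X matches any residue. The function name and the example sequence are hypothetical and do not correspond to any sequence of the invention.

```python
import re

# LPXTG sortase substrate motif; X is treated as any single residue letter.
LPXTG_PATTERN = re.compile(r"LP[A-Z]TG")

def find_lpxtg_motifs(protein_sequence):
    """Return (start_index, matched_text) for each LPXTG motif found."""
    sequence = protein_sequence.upper()
    return [(m.start(), m.group()) for m in LPXTG_PATTERN.finditer(sequence)]

# Hypothetical sequence for illustration only; not taken from the source.
example_sequence = "MKKAVLPETGDEEKLPSTGAA"
print(find_lpxtg_motifs(example_sequence))
```

Other sortase substrate motifs would need their own patterns; this sketch covers only the LPXTG form named in the text.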

The S. pneumoniae from TIGR4 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae from TIGR4 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae from TIGR4 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae from TIGR4 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae from TIGR4 AI may encode for at least one surface protein. Alternatively, S. pneumoniae from TIGR4 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae from TIGR4 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.
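The cleavage position described above, between the threonine and glycine residues of an LPXTG motif, can be illustrated at the sequence level. The following is a minimal Python sketch that models only where the cut falls in a one-letter-code sequence, not the enzymatic reaction; the function name and example sequence are hypothetical.

```python
import re

# LPXTG motif; X is treated as any single residue letter.
LPXTG = re.compile(r"LP[A-Z]TG")

def cleave_at_lpxtg(protein_sequence):
    """Split a sequence between the T and G of its first LPXTG motif.

    Returns (n_terminal_part, c_terminal_part), or None if no motif is found.
    The N-terminal part ends in the threonine whose carboxyl group would
    take part in the amide link described in the text.
    """
    match = LPXTG.search(protein_sequence)
    if match is None:
        return None
    cut = match.start() + 4  # position of the glycine within the sequence
    return protein_sequence[:cut], protein_sequence[cut:]

# Hypothetical sequence for illustration only; not taken from the source.
print(cleave_at_lpxtg("AAALPETGKK"))
```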

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae from TIGR4 AI surface protein such as SP0462, SP0463, SP0464, or SP0465. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae from TIGR4 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae from TIGR4 AI proteins and one or more S. pneumoniae strain 670 AI proteins, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae from TIGR4 AI proteins, S. pneumoniae from TIGR4 AI may also include a transcriptional regulator.

S. pneumoniae Strain 670 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 670. The S. pneumoniae strain 670 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 670 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of orf1_670, orf3_670, orf4_670, orf5_670, orf6_670, orf7_670, and orf8_670.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 670 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 670 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 670 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 670 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 670 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 670 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 670 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 670 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 670 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 670 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 670 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 670 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 670 AI surface protein such as orf3_670, orf4_670, or orf5_670. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 670 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 670 AI proteins and one or more S. pneumoniae from TIGR4 AI proteins, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 670 AI proteins, S. pneumoniae strain 670 AI may also include a transcriptional regulator.

S. pneumoniae Strain 14 CSR 10 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 14 CSR 10. The S. pneumoniae strain 14 CSR 10 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 14 CSR 10 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_14CSR, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF6_14CSR, ORF7_14CSR, and ORF8_14CSR.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 14 CSR 10 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 14 CSR 10 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 14 CSR 10 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 14 CSR 10 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 14 CSR 10 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 14 CSR 10 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 14 CSR 10 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 14 CSR 10 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 14 CSR 10 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 14 CSR 10 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 14 CSR 10 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 14 CSR 10 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 14 CSR 10 AI surface protein such as orf3_CSR, orf4_CSR, or orf5_CSR. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 14 CSR 10 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 14 CSR 10 AI proteins, and one or more AI proteins of any of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 19F Taiwan 14, 23F Taiwan 15, or 23F Poland 16, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 14 CSR 10 AI proteins, S. pneumoniae strain 14 CSR 10 AI may also include a transcriptional regulator.

S. pneumoniae Strain 19A Hungary 6 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 19A Hungary 6. The S. pneumoniae strain 19A Hungary 6 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 19A Hungary 6 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_19AH, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF6_19AH, ORF7_19AH, and ORF8_19AH.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 19A Hungary 6 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 19A Hungary 6 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 19A Hungary 6 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 19A Hungary 6 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 19A Hungary 6 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 19A Hungary 6 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 19A Hungary 6 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 19A Hungary 6 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 19A Hungary 6 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 19A Hungary 6 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 19A Hungary 6 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 19A Hungary 6 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 19A Hungary 6 AI surface protein such as orf3_19AH, orf4_19AH, or orf5_19AH. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 19A Hungary 6 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 19A Hungary 6 AI proteins and one or more AI proteins from any one of S. pneumoniae from TIGR4, 670, 14 CSR 10, 6B Finland 12, 6B Spain 2, 9V Spain 3, 19F Taiwan 14, 23F Taiwan 15, or 23F Poland 16, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 19A Hungary 6 AI proteins, S. pneumoniae strain 19A Hungary 6 AI may also include a transcriptional regulator.

S. pneumoniae Strain 19F Taiwan 14 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 19F Taiwan 14. The S. pneumoniae strain 19F Taiwan 14 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 19F Taiwan 14 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_19FTW, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF6_19FTW, ORF7_19FTW, and ORF8_19FTW.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 19F Taiwan 14 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 19F Taiwan 14 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 19F Taiwan 14 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 19F Taiwan 14 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 19F Taiwan 14 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 19F Taiwan 14 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 19F Taiwan 14 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 19F Taiwan 14 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 19F Taiwan 14 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 19F Taiwan 14 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 19F Taiwan 14 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 19F Taiwan 14 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 19F Taiwan 14 AI surface protein such as orf3_19FTW, orf4_19FTW, or orf5_19FTW. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 19F Taiwan 14 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 19F Taiwan 14 AI proteins and one or more AI proteins of any one or more of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 14 CSR 10, 23F Taiwan 15, or 23F Poland 16, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 19F Taiwan 14 AI proteins, S. pneumoniae strain 19F Taiwan 14 AI may also include a transcriptional regulator.

S. pneumoniae Strain 23F Poland 16 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 23F Poland 16. The S. pneumoniae strain 23F Poland 16 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 23F Poland 16 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_23FP, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF6_23FP, ORF7_23FP, and ORF8_23FP.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 23F Poland 16 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 23F Poland 16 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 23F Poland 16 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 23F Poland 16 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 23F Poland 16 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 23F Poland 16 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 23F Poland 16 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 23F Poland 16 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 23F Poland 16 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 23F Poland 16 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 23F Poland 16 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 23F Poland 16 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 23F Poland 16 AI surface protein such as orf3_23FP, orf4_23FP, or orf5_23FP. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 23F Poland 16 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 23F Poland 16 AI proteins and one or more AI proteins from any one or more S. pneumoniae strains of TIGR4, 670, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 19F Taiwan 14, 23F Taiwan 15, or 14 CSR 10, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 23F Poland 16 AI proteins, S. pneumoniae strain 23F Poland 16 AI may also include a transcriptional regulator.

S. pneumoniae Strain 23F Taiwan 15 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 23F Taiwan 15. The S. pneumoniae strain 23F Taiwan 15 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 23F Taiwan 15 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_23FTW, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF6_23FTW, ORF7_23FTW, and ORF8_23FTW.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 23F Taiwan 15 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 23F Taiwan 15 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 23F Taiwan 15 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 23F Taiwan 15 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 23F Taiwan 15 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 23F Taiwan 15 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 23F Taiwan 15 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 23F Taiwan 15 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 23F Taiwan 15 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 23F Taiwan 15 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 23F Taiwan 15 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 23F Taiwan 15 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 23F Taiwan 15 AI surface protein such as orf3_23FTW, orf4_23FTW, or orf5_23FTW. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 23F Taiwan 15 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 23F Taiwan 15 AI proteins and one or more AI proteins from any one or more of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 19F Taiwan 14, 14 CSR 10, or 23F Poland 16 AI, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 23F Taiwan 15 AI proteins, S. pneumoniae strain 23F Taiwan 15 AI may also include a transcriptional regulator.

S. pneumoniae Strain 6B Finland 12 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 6B Finland 12. The S. pneumoniae strain 6B Finland 12 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 6B Finland 12 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_6BF, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF6_6BF, ORF7_6BF, and ORF8_6BF.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 6B Finland 12 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 6B Finland 12 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 6B Finland 12 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 6B Finland 12 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 6B Finland 12 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 6B Finland 12 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 6B Finland 12 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 6B Finland 12 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 6B Finland 12 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 6B Finland 12 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 6B Finland 12 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 6B Finland 12 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 6B Finland 12 AI surface protein such as orf3_6BF, orf4_6BF, or orf5_6BF. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 6B Finland 12 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 6B Finland 12 AI proteins and one or more AI proteins of any one or more of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 14 CSR 10, 6B Spain 2, 9V Spain 3, 19F Taiwan 14, 23F Taiwan 15, or 23F Poland 16 AI, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 6B Finland 12 AI proteins, S. pneumoniae strain 6B Finland 12 AI may also include a transcriptional regulator.

S. pneumoniae Strain 6B Spain 2 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 6B Spain 2. The S. pneumoniae strain 6B Spain 2 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 6B Spain 2 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_6BSP, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF6_6BSP, ORF7_6BSP, and ORF8_6BSP.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 6B Spain 2 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 6B Spain 2 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 6B Spain 2 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 6B Spain 2 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 6B Spain 2 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 6B Spain 2 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 6B Spain 2 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 6B Spain 2 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 6B Spain 2 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 6B Spain 2 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 6B Spain 2 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 6B Spain 2 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 6B Spain 2 AI surface protein such as orf3_6BSP, orf4_6BSP, or orf5_6BSP. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 6B Spain 2 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 6B Spain 2 AI proteins and one or more AI proteins of any one or more of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 6B Finland 12, 14 CSR 10, 9V Spain 3, 19F Taiwan 14, 23F Taiwan 15, or 23F Poland 16 AI, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 6B Spain 2 AI proteins, S. pneumoniae strain 6B Spain 2 AI may also include a transcriptional regulator.

S. pneumoniae Strain 9V Spain 3 Adhesin Island

As discussed above, Applicants have identified adhesin islands within the genome of S. pneumoniae strain 9V Spain 3. The S. pneumoniae strain 9V Spain 3 Adhesin Island comprises a series of approximately seven open reading frames encoding for a collection of amino acid sequences comprising surface proteins and sortases. Specifically, the S. pneumoniae strain 9V Spain 3 AI proteins include open reading frames encoding for two or more (i.e., 2, 3, 4, 5, 6, or 7) of ORF2_9VSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, ORF6_9VSP, ORF7_9VSP, and ORF8_9VSP.

A preferred immunogenic composition of the invention comprises a S. pneumoniae strain 9V Spain 3 AI surface protein which may be formulated or purified in an oligomeric (pilus) form. Another preferred immunogenic composition of the invention comprises a S. pneumoniae strain 9V Spain 3 AI surface protein which has been isolated in an oligomeric (pilus) form.

One or more of the S. pneumoniae strain 9V Spain 3 AI open reading frame polynucleotide sequences may be replaced by a polynucleotide sequence coding for a fragment of the replaced ORF. Alternatively, one or more of the S. pneumoniae strain 9V Spain 3 AI open reading frames may be replaced by a sequence having sequence homology to the replaced ORF.

One or more of the S. pneumoniae strain 9V Spain 3 AI surface protein sequences typically include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif.

The S. pneumoniae strain 9V Spain 3 AI surface proteins of the invention may affect the ability of the S. pneumoniae bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of S. pneumoniae to translocate through an epithelial cell layer. Preferably, one or more S. pneumoniae strain 9V Spain 3 AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. S. pneumoniae strain 9V Spain 3 AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

The S. pneumoniae strain 9V Spain 3 AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. S. pneumoniae strain 9V Spain 3 AI may encode for at least one surface protein. Alternatively, S. pneumoniae strain 9V Spain 3 AI may encode for at least two surface exposed proteins and at least one sortase. Preferably, S. pneumoniae strain 9V Spain 3 AI encodes for at least three surface exposed proteins and at least two sortases.

The AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a S. pneumoniae strain 9V Spain 3 AI surface protein such as orf3_9VSP, orf4_9VSP, or orf5_9VSP. The oligomeric, pilus-like structure may comprise numerous units of AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine or serine amino acid residue, respectively.

AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include a pilin motif.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a S. pneumoniae strain 9V Spain 3 AI protein in oligomeric form, preferably in a hyperoligomeric form. In one embodiment, the invention comprises a composition comprising one or more S. pneumoniae strain 9V Spain 3 AI proteins and one or more AI proteins from any one or more of S. pneumoniae from TIGR4, 670, 19A Hungary 6, 6B Finland 12, 6B Spain 2, 14 CSR 10, 19F Taiwan 14, 23F Taiwan 15, or 23F Poland 16 AI, wherein one or more of the S. pneumoniae AI proteins is in the form of an oligomer, preferably in a hyperoligomeric form.

In addition to the open reading frames encoding the S. pneumoniae strain 9V Spain 3 AI proteins, S. pneumoniae strain 9V Spain 3 AI may also include a transcriptional regulator.

The S. pneumoniae oligomeric, pilus-like structures may be isolated or purified from bacterial cultures in which the bacteria express an S. pneumoniae AI surface protein. The invention therefore includes a method for manufacturing an oligomeric AI surface antigen comprising culturing a S. pneumoniae bacterium that expresses the oligomeric AI protein and isolating the expressed oligomeric AI protein from the S. pneumoniae bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed AI protein. Preferably, the AI protein is in a hyperoligomeric form.

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures overexpressing an AI surface protein. The invention therefore includes a method for manufacturing an S. pneumoniae oligomeric Adhesin Island surface antigen comprising culturing a S. pneumoniae bacterium adapted for increased AI protein expression and isolating the expressed oligomeric Adhesin Island protein from the S. pneumoniae bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed Adhesin Island protein. Preferably, the Adhesin Island protein is in a hyperoligomeric form.

The S. pneumoniae bacteria are preferably adapted to increase AI protein expression by at least two (e.g., 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150 or 200) times wild type expression levels.

S. pneumoniae bacteria may be adapted to increase AI protein expression by any means known in the art, including methods of increasing gene dosage and methods of gene upregulation. Such means include, for example, transformation of the S. pneumoniae bacteria with a plasmid encoding the AI protein. The plasmid may include a strong promoter or it may include multiple copies of the sequence encoding the AI protein. Optionally, the sequence encoding the AI protein within the S. pneumoniae bacterial genome may be deleted. Alternatively, or in addition, the promoter regulating the S. pneumoniae Adhesin Island may be modified to increase expression.

The invention further includes S. pneumoniae bacteria which have been adapted to produce increased levels of AI surface protein. In particular, the invention includes S. pneumoniae bacteria which have been adapted to produce oligomeric or hyperoligomeric AI surface protein. In one embodiment, the S. pneumoniae of the invention are inactivated or attenuated to permit in vivo delivery of the whole bacteria, with the AI surface protein exposed on its surface.

The invention further includes S. pneumoniae bacteria which have been adapted to have increased levels of expressed AI protein incorporated in pili on their surface. The S. pneumoniae bacteria may be adapted to have increased exposure of oligomeric or hyperoligomeric AI proteins on their surface by increasing expression levels of a signal peptidase polypeptide. Increased levels of local signal peptidase expression in Gram positive bacteria (such as LepA in GAS) are expected to result in increased exposure of pili proteins on the surface of Gram positive bacteria. Increased expression of a leader peptidase in S. pneumoniae may be achieved by any means known in the art, such as increasing gene dosage and methods of gene upregulation. The S. pneumoniae bacteria adapted to have increased levels of leader peptidase may additionally be adapted to express increased levels of at least one pili protein.

Alternatively, the AI proteins of the invention may be expressed on the surface of a non-pathogenic Gram positive bacteria, such as Streptococcus gordonii (See, e.g., Byrd et al., “Biological consequences of antigen and cytokine co-expression by recombinant Streptococcus gordonii vaccine vectors”, Vaccine (2002) 20:2197-2205) or Lactococcus lactis (See, e.g., Mannam et al., “Mucosal Vaccine Made from Live, Recombinant Lactococcus lactis Protects Mice against Pharyngeal Infection with Streptococcus pyogenes” Infection and Immunity (2004) 72(6):3444-3450). As used herein, non-pathogenic Gram positive bacteria refer to Gram positive bacteria which are compatible with a human host subject and are not associated with human pathogenesis. Preferably, the non-pathogenic bacteria are modified to express the AI surface protein in oligomeric, or hyper-oligomeric form. Sequences encoding for an AI surface protein and, optionally, an AI sortase, may be integrated into the non-pathogenic Gram positive bacterial genome or inserted into a plasmid. The non-pathogenic Gram positive bacteria may be inactivated or attenuated to facilitate in vivo delivery of the whole bacteria, with the AI surface protein exposed on its surface. Alternatively, the AI surface protein may be isolated or purified from a bacterial culture of the non-pathogenic Gram positive bacteria. For example, the AI surface protein may be isolated from cell extracts or culture supernatants. Alternatively, the AI surface protein may be isolated or purified from the surface of the non-pathogenic Gram positive bacteria.

The non-pathogenic Gram positive bacteria may be used to express any of the S. pneumoniae Adhesin Island proteins described herein. The non-pathogenic Gram positive bacteria are transformed to express an Adhesin Island surface protein. Preferably, the non-pathogenic Gram positive bacteria also express at least one Adhesin Island sortase. The AI transformed non-pathogenic Gram positive bacteria of the invention may be used to prevent or treat infection with pathogenic S. pneumoniae.

FIGS. 190 A and B, and 193-195 provide examples of three methods successfully practiced by applicants to purify pili from S. pneumoniae TIGR4.

Immunogenic Compositions

The Gram positive bacteria AI proteins described herein are useful in immunogenic compositions for the prevention or treatment of Gram positive bacterial infection. For example, the GBS AI surface proteins described herein are useful in immunogenic compositions for the prevention or treatment of GBS infection. As another example, the GAS AI surface proteins described herein may be useful in immunogenic compositions for the prevention or treatment of GAS infection. As another example, the S. pneumoniae AI surface proteins may be useful in immunogenic compositions for the prevention or treatment of S. pneumoniae infection.

Gram positive bacteria AI surface proteins that can provide protection across more than one serotype or strain isolate may be used to increase immunogenic effectiveness. For example, a particular GBS AI surface protein having an amino acid sequence that is at least 50% (i.e., at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100%) homologous to the particular GBS AI surface protein of at least 2 (i.e., at least 3, 4, 5, 6, 7, 8, 9, 10, or more) other GBS serotypes or strain isolates may be used to increase the effectiveness of such compositions.

As another example, fragments of Gram positive bacteria AI surface proteins that can provide protection across more than one serotype or strain isolate may be used to increase immunogenic effectiveness. Such a fragment may be identified within a consensus sequence of a full length amino acid sequence of a Gram positive bacteria AI surface protein. Such a fragment can be identified in the consensus sequence by its high degree of homology or identity across multiple (i.e., at least 3, 4, 5, 6, 7, 8, 9, 10, or more) Gram positive bacteria serotypes or strain isolates. Preferably, a high degree of homology is a degree of homology of at least 90% (i.e., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%) across Gram positive bacteria serotypes or strain isolates. Preferably, a high degree of identity is a degree of identity of at least 90% (i.e., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%) across Gram positive bacteria serotypes or strain isolates. In one embodiment of the invention, such a fragment of a Gram positive bacteria AI surface protein may be used in the immunogenic compositions.
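By way of illustration only, the search for a conserved fragment within aligned sequences can be sketched as follows. The sketch assumes the sequences have already been aligned to equal length, scores each column by the fraction of sequences sharing its most common residue, and reports windows whose mean score meets a threshold. The function names, the scoring rule, the window length, and the toy sequences are hypothetical and are not taken from the specification.

```python
from collections import Counter
from typing import List, Tuple


def column_identity(aligned: List[str]) -> List[float]:
    """Fraction of sequences sharing the most common residue in each column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for i in range(length):
        # Gap characters ("-") are never counted as conserved residues.
        counts = Counter(seq[i] for seq in aligned if seq[i] != "-")
        scores.append(max(counts.values()) / len(aligned) if counts else 0.0)
    return scores


def conserved_windows(aligned: List[str], window: int,
                      threshold: float = 0.90) -> List[Tuple[int, int]]:
    """Half-open (start, end) windows whose mean column identity >= threshold."""
    scores = column_identity(aligned)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            hits.append((start, start + window))
    return hits


# Hypothetical toy alignment of one protein from three strain isolates.
TOY_ALIGNMENT = ["MKLPSTGAA", "MKLPSTGAV", "MKLPATGAA"]
print(conserved_windows(TOY_ALIGNMENT, window=4, threshold=0.95))
```

In practice a fragment would be much longer than four residues, and the identity measure would be whichever one is used to generate the consensus sequence.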

In addition, the AI surface protein oligomeric pilus structures may be formulated or purified for use in immunization. Isolated AI surface protein oligomeric pilus structures may also be used for immunization.

The invention includes an immunogenic composition comprising a first Gram positive bacteria AI protein and a second Gram positive bacterial AI protein. One or more of the AI proteins may be a surface protein. Such surface proteins may contain an LPXTG motif or other sortase substrate motif.
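As an illustrative sketch, the canonical LPXTG motif (where X is any amino acid) can be located in a one-letter amino acid sequence with a regular expression. The function name and the example sequence below are hypothetical, and the pattern does not cover other sortase substrate motifs.

```python
import re
from typing import List, Tuple

# L-P-X-T-G, where X is any one-letter amino acid code.
LPXTG_PATTERN = re.compile(r"LP[A-Z]TG")


def find_lpxtg(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based position, matched motif) for each LPXTG occurrence."""
    return [(m.start(), m.group())
            for m in LPXTG_PATTERN.finditer(sequence.upper())]


# Hypothetical sequence, not an actual AI surface protein.
EXAMPLE_SEQUENCE = "MKKTLPSTGEEAAVLPATGK"
print(find_lpxtg(EXAMPLE_SEQUENCE))
```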

The first and second AI proteins may be from the same or different genus or species of Gram positive bacteria. If within the same species, the first and second AI proteins may be from the same or different AI subtypes. If two AIs are of the same subtype, the AIs have the same numerical designation. For example, all AIs designated as AI-1 are of the same AI subtype. If two AIs are of a different subtype, the AIs have different numerical designations. For example, AI-1 is of a different AI subtype from AI-2, AI-3, AI-4, etc. Likewise, AI-2 is of a different AI subtype from AI-1, AI-3, and AI-4, etc.

For example, the invention includes an immunogenic composition comprising one or more GBS AI-1 proteins and one or more GBS AI-2 proteins. One or more of the AI proteins may be a surface protein. Such surface proteins may contain an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) and may bind fibrinogen, fibronectin, or collagen. One or more of the AI proteins may be a sortase. The GBS AI-1 proteins may be selected from the group consisting of GBS 80, GBS 104, GBS 52, SAG0647 and SAG0648. Preferably, the GBS AI-1 proteins include GBS 80 or GBS 104.

The GBS AI-2 proteins may be selected from the group consisting of GBS 67, GBS 59, GBS 150, SAG1405, SAG1406, 01520, 01521, 01522, 01523, 01523, 01524 and 01525. In one embodiment, the GBS AI-2 proteins are selected from the group consisting of GBS 67, GBS 59, GBS 150, SAG1405, and SAG1406. In another embodiment, the GBS AI-2 proteins may be selected from the group consisting of 01520, 01521, 01522, 01523, 01523, 01524 and 01525. Preferably, the GBS AI-2 protein includes GBS 59 or GBS 67.

As another example, the invention includes an immunogenic composition comprising one or more of any combination of GAS AI-1, GAS AI-2, GAS AI-3, or GAS AI-4 proteins. One or more of the GAS AI proteins may be a sortase. The GAS AI-1 proteins may be selected from the group consisting of M6_Spy0156, M6_Spy0157, M6_Spy0158, M6_Spy0159, M6_Spy0160, M6_Spy0161, CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial. Preferably, the GAS AI-1 proteins are selected from the group consisting of M6_Spy0157, M6_Spy0159, M6_Spy0160, CDC SS 410_fimbrial, ISS3650_fimbrial, and DSM2071_fimbrial.

The GAS AI-2 proteins may be selected from the group consisting of Spy0124, GAS15, Spy0127, GAS16, GAS17, GAS18, Spy0131, Spy0133, and GAS20. Preferably, the GAS AI-2 proteins are selected from the group consisting of GAS 15, GAS 16, and GAS 18.

The GAS AI-3 proteins may be selected from the group consisting of SpyM3_0097, SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, SpyM3_0104, SPs0099, SPs0100, SPs0101, SPs0102, SPs0103, SPs0104, SPs0105, SPs0106, orf77, orf78, orf79, orf80, orf81, orf82, orf83, orf84, spyM18_0125, spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, and ISS4959_fimbrial. In one embodiment the GAS AI-3 proteins are selected from the group consisting of SpyM3_0097, SpyM3_0098, SpyM3_0099, SpyM3_0100, SpyM3_0101, SpyM3_0102, SpyM3_0103, and SpyM3_0104. In another embodiment, the GAS AI-3 proteins are selected from the group consisting of SPs0099, SPs0100, SPs0101, SPs0102, SPs0103, SPs0104, SPs0105, and SPs0106. In yet another embodiment, the GAS AI-3 proteins are selected from the group consisting of orf77, orf78, orf79, orf80, orf81, orf82, orf83, and orf84. In a further embodiment, the GAS AI-3 proteins are selected from the group consisting of spyM18_0125, spyM18_0126, spyM18_0127, spyM18_0128, spyM18_0129, spyM18_0130, spyM18_0131, and spyM18_0132. In yet another embodiment the GAS AI-3 proteins are selected from the group consisting of SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, and SpyoM01000149.

The GAS AI-4 proteins may be selected from the group consisting of 19224133, 19224134, 19224135, 19224136, 19224137, 19224138, 19224139, 19224140, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial. Preferably, the GAS AI-4 proteins are selected from the group consisting of 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, and ISS4538_fimbrial.

As yet another example, the invention includes an immunogenic composition comprising one or more of any combination of S. pneumoniae from TIGR4, S. pneumoniae strain 670, S. pneumoniae from 19A Hungary 6, S. pneumoniae from 6B Finland 12, S. pneumoniae from 6B Spain 2, S. pneumoniae from 9V Spain 3, S. pneumoniae from 14 CSR 10, S. pneumoniae from 19F Taiwan 14, S. pneumoniae from 23F Taiwan 15, or S. pneumoniae from 23F Poland 16 AI proteins. One or more of the AI proteins may be a surface protein. Such surface proteins may contain an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) and may bind fibrinogen, fibronectin, or collagen. One or more of the AI proteins may be a sortase.

The S. pneumoniae from TIGR4 AI proteins may be selected from the group consisting of SP0462, SP0463, SP0464, SP0465, SP0466, SP0467, and SP0468. Preferably, the S. pneumoniae from TIGR4 AI proteins include SP0462, SP0463, or SP0464.

The S. pneumoniae strain 670 AI proteins may be selected from the group consisting of Orf1_670, Orf3_670, Orf4_670, Orf5_670, Orf6_670, Orf7_670, and Orf8_670. Preferably, the S. pneumoniae strain 670 AI proteins include Orf3_670, Orf4_670, or Orf5_670.

The S. pneumoniae from 19A Hungary 6 AI proteins may be selected from the group consisting of ORF2_19AH, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF6_19AH, ORF7_19AH, or ORF8_19AH.

The S. pneumoniae from 6B Finland 12 AI proteins may be selected from the group consisting of ORF2_6BF, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF6_6BF, ORF7_6BF, or ORF8_6BF.

The S. pneumoniae from 6B Spain 2 AI proteins may be selected from the group consisting of ORF2_6BSP, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF6_6BSP, ORF7_6BSP, or ORF8_6BSP.

The S. pneumoniae from 9V Spain 3 AI proteins may be selected from the group consisting of ORF2_9VSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, ORF6_9VSP, ORF7_9VSP, or ORF8_9VSP.

The S. pneumoniae from 14 CSR 10 AI proteins may be selected from the group consisting of ORF2_14CSR, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF6_14CSR, ORF7_14CSR, or ORF8_14CSR.

The S. pneumoniae from 19F Taiwan 14 AI proteins may be selected from the group consisting of ORF2_19FTW, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF6_19FTW, ORF7_19FTW, or ORF8_19FTW.

The S. pneumoniae from 23F Taiwan 15 AI proteins may be selected from the group consisting of ORF2_23FTW, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF6_23FTW, ORF7_23FTW, or ORF8_23FTW.

The S. pneumoniae from 23F Poland 16 AI proteins may be selected from the group consisting of ORF2_23FP, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF6_23FP, ORF7_23FP, or ORF8_23FP.

Preferably, the Gram positive bacteria AI proteins included in the immunogenic compositions of the invention can provide protection across more than one serotype or strain isolate. For example, the immunogenic composition may comprise a first AI protein, wherein the amino acid sequence of said AI protein is at least 90% (i.e., at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100%) homologous to the amino acid sequence of a second AI protein, and wherein said first AI protein and said second AI protein are derived from the genomes of different serotypes of a Gram positive bacteria. The first AI protein may also be homologous to the amino acid sequence of a third AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different serotypes of a Gram positive bacteria. The first AI protein may also be homologous to the amino acid sequence of a fourth AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different serotypes of a Gram positive bacteria.

For example, preferably, the GBS AI proteins included in the immunogenic compositions of the invention can provide protection across more than one GBS serotype or strain isolate. For example, the immunogenic composition may comprise a first GBS AI protein, wherein the amino acid sequence of said AI protein is at least 90% (i.e., at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100%) homologous to the amino acid sequence of a second GBS AI protein, and wherein said first AI protein and said second AI protein are derived from the genomes of different GBS serotypes. The first GBS AI protein may also be homologous to the amino acid sequence of a third GBS AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different GBS serotypes. The first AI protein may also be homologous to the amino acid sequence of a fourth GBS AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different GBS serotypes.

The first AI protein may be selected from an AI-1 protein or an AI-2 protein. For example, the first AI protein may be a GBS AI-1 surface protein such as GBS 80. The amino acid sequence of GBS 80 from GBS serotype V, strain isolate 2603 is greater than 90% homologous to the GBS 80 amino acid sequence from GBS serotype III, strain isolates NEM316 and COH1 and the GBS 80 amino acid sequence from GBS serotype Ia, strain isolate A909.

As another example, the first AI protein may be GBS 104. The amino acid sequence of GBS 104 from GBS serotype V, strain isolate 2603 is greater than 90% homologous to the GBS 104 amino acid sequence from GBS serotype III, strain isolates NEM316 and COH1, the GBS 104 amino acid sequence from GBS serotype Ia, strain isolate A909, and the GBS 104 amino acid sequence from GBS serotype II, strain isolate 18RS21.

Table 12 provides the amino acid sequence identity of GBS 80 and GBS 104 across GBS serotypes Ia, Ib, II, III, V, and VIII. The GBS strains in which genes encoding GBS 80 and GBS 104 were identified share, on average, 99.88% and 99.96% amino acid sequence identity, respectively. This high degree of amino acid identity indicates that an immunogenic composition comprising a first protein of GBS 80 or GBS 104 may provide protection across more than one GBS serotype or strain isolate.

TABLE 12
Conservation of GBS 80 and GBS 104 amino acid sequences

                      GBS 80                   GBS 104
Serotype  Strain      cGH    % AA identity     cGH    % AA identity
Ia        090         +      99.79             +      100.00
          A909        +      100.00            +      100.00
          515         −                        −
          DK1         −                        −
          DK8         −                        −
          Davis       −                        −
Ib        7357b       +      100.00            +
          H36B        −                        −
II        18RS21      −                        +      100.00
          DK21        −                        −
III       NEM316      +      100.00            +      100.00
          COH31       +      100.00            +
          D136        +      100.00            +
          M732        +      100.00            +      99.88
          COH1        +      99.79             +      99.88
          M781        +      99.79             +      99.88
No type   CJB110      +      99.37             +      100.00
          1169NT      −                        −
V         CJB111      +      100.00            +      100.00
          2603        +      100.00            +      100.00
VIII      JM130013    +      99.79             +      100.00
          SMU014      +      100.00            +
total                 14/22  99.88 +/− 0.19    15/22  99.96 +/− 0.056

As another example, the first AI protein may be an AI-2 protein such as GBS 67. The amino acid sequence of GBS 67 from GBS serotype V, strain isolate 2603 is greater than 90% homologous to the GBS 67 amino acid sequence from GBS serotype III, strain isolate NEM316, the GBS 67 amino acid sequence from GBS serotype Ib, strain isolate H36B, and the GBS 67 amino acid sequence from GBS serotype II, strain isolate 18RS21.

As another example, the first AI protein may be an AI-2 protein such as spb1. The amino acid sequence of spb1 from GBS serotype III, strain isolate COH1 is greater than 90% homologous to the spb1 amino acid sequence from GBS serotype Ia, strain isolate A909.

As yet another example, the first AI protein may be an AI-2 protein such as GBS 59. The amino acid sequence of GBS 59 from GBS serotype II, strain isolate 18RS21 is 100% homologous to the GBS 59 amino acid sequence from GBS serotype V, strain isolate 2603. The amino acid sequence of GBS 59 from GBS serotype V, strain isolate CJB111 is 98% homologous to the GBS 59 amino acid sequence from GBS serotype III, strain isolate NEM316.

The compositions of the invention may also be designed to include Gram positive AI proteins from divergent serotypes or strain isolates, i.e., to include a first AI protein which is present in one collection of serotypes or strain isolates of a Gram positive bacteria and a second AI protein which is present in those serotypes or strain isolates not represented by the first AI protein.

For example, the invention may include an immunogenic composition comprising a first and second Gram positive bacteria AI protein, wherein a polynucleotide sequence encoding the full length sequence of the first AI protein is not present in a similar Gram positive bacterial genome comprising a polynucleotide sequence encoding the second AI protein.

The compositions of the invention may also be designed to include AI proteins from divergent GBS serotypes or strain isolates, i.e., to include a first AI protein which is present in one collection of GBS serotypes or strain isolates and a second AI protein which is present in those serotypes or strain isolates not represented by the first AI protein.

For example, the invention may include an immunogenic composition comprising a first and second GBS AI protein, wherein a polynucleotide sequence encoding the full length sequence of the first GBS AI protein is not present in a genome comprising a polynucleotide sequence encoding the second GBS AI protein. For example, the first AI protein could be GBS 80 (such as the GBS 80 sequence from GBS serotype V, strain isolate 2603). As previously discussed (and depicted in FIG. 2), the sequence for GBS 80 in GBS serotype II, strain isolate 18RS21 is disrupted. In this instance, the second AI protein could be GBS 104 or GBS 67 (sequences selected from the GBS serotype II, strain isolate 18RS21).

Further, the invention may include an immunogenic composition comprising a first and second GBS AI protein, wherein the first GBS AI protein has detectable surface exposure on a first GBS strain or serotype but not a second GBS strain or serotype and the second GBS AI protein has detectable surface exposure on a second GBS strain or serotype but not a first GBS strain or serotype. For example, the first AI protein could be GBS 80 and the second AI protein could be GBS 67. As seen in Table 15, there are some GBS serotypes and strains that have surface exposed GBS 80 but that do not have surface exposed GBS 67 and vice versa. An immunogenic composition comprising a GBS 80 and a GBS 67 protein may provide protection across a wider group of GBS strains and serotypes.

TABLE 15
Antigen surface exposure of GBS 80, GBS 322, GBS 104, and GBS 67

GBS strain    Type   GBS 80   GBS 322   GBS 104   GBS 67
DK1*          Ia     0        nd        237       478
DK8*                 0        213       151       475
Davis*               0        86        271       430
515*                 0        227       262       409
090                  0        0         0         0
A909                 0        0         0         0
2986                 0        0         157       397
5551                 0        36        384       485
2177          Ib     477      323       328       66
H36B*                0        105       518       444
7357b-               91       102       309       316
2129                 57       71        132       0
5518                 31       nd        60        28
COH1          III    305      130       305       0
D136C                16       460       226       406
COH31                0        479       71        273
M732                 105      292       101       0
M781                 65       224       136       0
1998                 95       288       205       350
5376                 165      76        156       0
5435                 93       88        100       0
18RS21        II     0        471       50        103
DK21*                0        342       419       331
3050                 43       188       289       460
5401                 170      135       494       618
2141                 0        76        0         69
CJB111        V      365      58        355       481
2603                 62       293       100       105
5364                 454      463       379       394
2110                 0        11        345       589
2274          IV     113      161       465       484
1999                 0        55        492       453
2210                 0        0         363       574
2928          VII    0        0         0         0
SMU071        VIII   556      170       393       79
JM9130013            587      133       436       83
2189                 0        0         0         0
5408                 0        0         159       433
CJB110        NT     71       587       169       245
1169*                0        213       371       443
Δ Mean >100          9/40     22/38     32/40     25/40
                     22%      58%       80%       62%
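The complementary coverage described above can be illustrated with a short sketch. The values are copied from the GBS 80 and GBS 67 columns of a few rows of Table 15; treating a value above 100 as "surface exposed" is an assumption made here for illustration (based on the "Δ Mean >100" summary row), and the function name is hypothetical.

```python
from typing import Dict, Iterable, Set

# Selected rows of Table 15 (GBS 80 and GBS 67 columns only).
SURFACE_EXPOSURE: Dict[str, Dict[str, int]] = {
    "COH1":   {"GBS 80": 305, "GBS 67": 0},
    "M732":   {"GBS 80": 105, "GBS 67": 0},
    "DK1":    {"GBS 80": 0,   "GBS 67": 478},
    "H36B":   {"GBS 80": 0,   "GBS 67": 444},
    "CJB111": {"GBS 80": 365, "GBS 67": 481},
    "A909":   {"GBS 80": 0,   "GBS 67": 0},
    "18RS21": {"GBS 80": 0,   "GBS 67": 103},
}

# Assumed cutoff for calling an antigen surface exposed.
EXPOSURE_CUTOFF = 100


def covered_strains(antigens: Iterable[str],
                    data: Dict[str, Dict[str, int]] = SURFACE_EXPOSURE,
                    cutoff: int = EXPOSURE_CUTOFF) -> Set[str]:
    """Strains on which at least one of the given antigens exceeds the cutoff."""
    antigens = list(antigens)
    return {strain for strain, values in data.items()
            if any(values[a] > cutoff for a in antigens)}


for combo in (["GBS 80"], ["GBS 67"], ["GBS 80", "GBS 67"]):
    hits = covered_strains(combo)
    print(combo, f"{len(hits)}/{len(SURFACE_EXPOSURE)}", sorted(hits))
```

On this subset, the two antigens together cover more strains than either antigen alone, which is the rationale given for combining them.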

Alternatively, the invention may include an immunogenic composition comprising a first and second Gram positive bacteria AI protein, wherein the polynucleotide sequence encoding the sequence of the first AI protein is less than 90% (i.e., less than 90, 88, 86, 84, 82, 80, 78, 76, 74, 72, 70, 65, 60, 55, 50, 45, 40, 35 or 30 percent) homologous to the corresponding sequence in the genome of the second AI protein.

The invention may include an immunogenic composition comprising a first and second GBS AI protein, wherein the polynucleotide sequence encoding the sequence of the first GBS AI protein is less than 90% (i.e., less than 90, 88, 86, 84, 82, 80, 78, 76, 74, 72, 70, 65, 60, 55, 50, 45, 40, 35 or 30 percent) homologous to the corresponding sequence in the genome of the second GBS AI protein. For example, the first GBS AI protein could be GBS 67 (such as the GBS 67 sequence from GBS serotype Ib, strain isolate H36B). As shown in FIGS. 2 and 4, the GBS 67 sequence for this strain is less than 90% homologous (87%) to the corresponding GBS 67 sequence in GBS serotype V, strain isolate 2603. In this instance, the second GBS AI protein could then be the GBS 80 sequence from GBS serotype V, strain isolate 2603.

An example immunogenic composition of the invention may comprise adhesin island proteins GBS 80, GBS 104, GBS 67, and GBS 59, and non-AI protein GBS 322. FACS analysis of different GBS strains demonstrates that at least one of these five proteins is always found to be expressed on the surface of GBS bacteria. An initial FACS analysis of 70 strains of GBS bacteria, obtained from the CDC in the United States (33 strains), ISS in Italy (17 strains), and Houston/Harvard (20 strains), detected surface exposure of at least one of GBS 80, GBS 104, GBS 322, GBS 67, or GBS 59 on the surface of the GBS bacteria. FIG. 227 provides the FACS data obtained for surface exposure of GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 on each of 37 GBS strains. FIG. 228 provides the FACS data obtained for surface exposure of GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 on each of 41 GBS strains obtained from the CDC. As can be seen from FIGS. 227 and 228, each GBS strain had surface expression of at least one of GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59. The surface exposure of at least one of these proteins on each bacterial strain indicates that an immunogenic composition comprising these proteins will provide wide protection across GBS strains and serotypes.

The surface exposed GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 proteins are also present at high levels as determined by FACS. Table 49 summarizes the FACS results for the initial 70 GBS strains examined for GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 surface expression. A protein was designated as having high levels of surface expression if a five-fold shift in fluorescence was observed when using antibodies for the protein relative to preimmune control serum.

TABLE 49
Exposure Levels of GBS 80, GBS 104, GBS 67, GBS 322, and GBS 59 on GBS Strains

                                       GBS 80   GBS 104   GBS 67   GBS 59   GBS 322
5-fold shift in fluorescence by FACS   17/70    14/70     49/70    46/70    33/70
                                       24%      20%       70%      66%      47%

Table 50 details which of the surface proteins is highly expressed on the different GBS serotypes.

TABLE 50
High Levels of Surface Protein Expression on GBS Serotypes
(5-fold shift in fluorescence by FACS)

                GBS 80   GBS 104   GBS 67   GBS 59   GBS 322
Ia + Ib + III   4/36     2/36      22/36    20/36    18/36
II + V          11/25    9/25      21/25    21/25    13/25
Others          2/9      3/9       6/9      5/9      2/9
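As a simple arithmetic check, the percentages in Table 49 follow from the strain counts, and the per-serotype counts in Table 50 add up to the Table 49 totals. The sketch below uses only the counts reported in the two tables; the variable and function names are illustrative.

```python
TOTAL_STRAINS = 70

# Strains with a five-fold fluorescence shift (Table 49).
HIGH_EXPRESSION = {
    "GBS 80": 17, "GBS 104": 14, "GBS 67": 49, "GBS 59": 46, "GBS 322": 33,
}

# The same counts broken down by serotype group (Table 50): (group size, counts).
BY_SEROTYPE_GROUP = {
    "Ia + Ib + III": (36, {"GBS 80": 4, "GBS 104": 2, "GBS 67": 22,
                           "GBS 59": 20, "GBS 322": 18}),
    "II + V":        (25, {"GBS 80": 11, "GBS 104": 9, "GBS 67": 21,
                           "GBS 59": 21, "GBS 322": 13}),
    "Others":        (9,  {"GBS 80": 2, "GBS 104": 3, "GBS 67": 6,
                           "GBS 59": 5, "GBS 322": 2}),
}


def percent(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


for protein, count in HIGH_EXPRESSION.items():
    group_sum = sum(counts[protein] for _, counts in BY_SEROTYPE_GROUP.values())
    print(f"{protein}: {count}/{TOTAL_STRAINS} = {percent(count, TOTAL_STRAINS)}% "
          f"(sum over serotype groups: {group_sum})")
```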

Alternatively, the immunogenic composition of the invention may include GBS 80, GBS 104, GBS 67, and GBS 322. Assuming that protein antigens that are highly accessible to antibodies confer 100% protection with suitable adjuvants, an immunogenic composition containing GBS 80, GBS 104, GBS 67, GBS 59 and GBS 322 will provide protection for 89% of GBS strains and serotypes, the same percentage as an immunogenic composition containing GBS 80, GBS 104, GBS 67, and GBS 322 proteins. See FIG. 229. However, it may be preferable to include GBS 59 in the composition to increase its immunogenic strength. As seen from Table 50, GBS 59 is highly expressed on the surface of two-thirds of GBS bacteria examined by FACS analysis, unlike GBS 80, GBS 104, and GBS 322, which are highly expressed in less than half of GBS bacteria examined. GBS 59 opsonophagocytic activity is also comparable to that of a mix of GBS 322, GBS 104, GBS 67, and GBS 80 proteins. See FIG. 230.

By way of another example, preferably, the GAS AI proteins included in the immunogenic compositions of the invention can provide protection across more than one GAS serotype or strain isolate. For example, the immunogenic composition may comprise a first GAS AI protein, wherein the amino acid sequence of said AI protein is at least 90% (i.e., at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100%) homologous to the amino acid sequence of a second GAS AI protein, and wherein said first AI protein and said second AI protein are derived from the genomes of different GAS serotypes. The first GAS AI protein may also be homologous to the amino acid sequence of a third GAS AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different GAS serotypes. The first AI protein may also be homologous to the amino acid sequence of a fourth GAS AI protein, such that the first AI protein, the second AI protein and the third AI protein are derived from the genomes of different GAS serotypes.

The compositions of the invention may also be designed to include GAS AI proteins from divergent serotypes or strain isolates, i.e., to include a first AI protein which is present in one collection of serotypes or strain isolates of a GAS bacteria and a second AI protein which is present in those serotypes or strain isolates not represented by the first AI protein.

For example, the first AI protein could be a prtF2 protein (such as the 19224141 protein from GAS serotype M12, strain isolate A735). As previously discussed (and depicted in FIG. 164), the sequence for a prtF2 protein is not present in GAS AI types 1 or 2. In this instance, the second AI protein could be collagen binding protein M6_Spy0159 (from M6 isolate (MGAS10394), which comprises an AI-1) or GAS15 (from M1 isolate (SF370), which comprises an AI-2).

Further, the invention may include an immunogenic composition comprising a first and second GAS AI protein, wherein the first GAS AI protein has detectable surface exposure on a first GAS strain or serotype but not a second GAS strain or serotype and the second GAS AI protein has detectable surface exposure on a second GAS strain or serotype but not a first GAS strain or serotype.

The invention may include an immunogenic composition comprising a first and second GAS AI protein, wherein the polynucleotide sequence encoding the sequence of the first GAS AI protein is less than 90% (i.e., less than 90, 88, 86, 84, 82, 80, 78, 76, 74, 72, 70, 65, 60, 55, 50, 45, 40, 35 or 30 percent) homologous to the corresponding sequence in the genome of the second GAS AI protein. Preferably the first and second GAS AI proteins are subunits of the pilus. More preferably the first and second GAS AI proteins are selected from the major pilus forming proteins (i.e., M6_Spy0160 from M6 strain 10394, SPy0128 from M1 strain SF370, SpyM3_0100 from M3 strain 315, SPs0102 from M3 strain SSI, orf80 from M5 isolate Manfredo, spyM18_0128 from M18 strain 8232, SpyoM01000153 from M49 strain 591, 19224137 from M12 strain A735, fimbrial structural subunit from M77 strain ISS4959, fimbrial structural subunit from M44 strain ISS3776, fimbrial structural subunit from M50 strain ISS3776 ISS 4538, fimbrial structural subunit from M12 strain CDC SS635, fimbrial structural subunit from M23 strain DSM2071, fimbrial structural subunit from M6 strain CDC SS410). Table 45 provides the percent identity between the amino acid sequences of each of the main pilus forming subunits from GAS AI-1, AI-2, AI-3, and AI-4 representative strains (i.e., the same major pilus forming proteins and strains listed in the preceding sentence).
TABLE 45 Comparison of Amino Acid Sequences of Major Pilus Proteins in the Four GAS AIs

For example, the first main pilus subunit may be selected from bacteria of GAS serotype M6 strain 10394 and the second main pilus subunit may be selected from bacteria of GAS serotype M1 strain 370. As can be seen from Table 45, the main pilus subunits encoded by these strains of bacteria share only 23% nucleotide identity. An immunogenic composition comprising pilus main subunits from each of these strains of bacteria is expected to provide protection across a wider group of GAS strains and serotypes. Other examples of main pilus subunits that can be used in combination to provide increased protection across a wider range of GAS strains and serotypes include proteins encoded by GAS serotype M5 Manfredo isolate and serotype M6 strain 10394, which share 23% sequence identity, GAS serotype M18 strain 8232 and serotype M1 strain 370, which share 38% sequence identity, GAS serotype M3 strain 315 and serotype M12 strain A735, which share 61% sequence identity, and GAS serotype M3 strain 315 and serotype M6 strain 10394 which share 25% sequence identity.
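The selection of divergent main pilus subunit pairs can be sketched as a simple filter over pairwise identity values. The percentages below are the ones quoted in the preceding paragraph; the function name and the choice of cutoff are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]

# Pairwise percent identity of main pilus subunits, as quoted in the text.
PAIRWISE_IDENTITY: Dict[Pair, int] = {
    ("M6 strain 10394", "M1 strain 370"): 23,
    ("M5 Manfredo isolate", "M6 strain 10394"): 23,
    ("M18 strain 8232", "M1 strain 370"): 38,
    ("M3 strain 315", "M12 strain A735"): 61,
    ("M3 strain 315", "M6 strain 10394"): 25,
}


def divergent_pairs(identities: Dict[Pair, int], cutoff: int = 90) -> List[Pair]:
    """Pairs sharing less than `cutoff` percent identity, most divergent first."""
    selected = [pair for pair, pct in identities.items() if pct < cutoff]
    return sorted(selected, key=lambda pair: identities[pair])


for pair in divergent_pairs(PAIRWISE_IDENTITY, cutoff=30):
    print(pair, f"{PAIRWISE_IDENTITY[pair]}%")
```

With the 90% cutoff recited above, all five pairs qualify; a stricter cutoff such as 30% keeps only the most divergent combinations.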

As also can be seen from Table 45, the amino acid sequences of the four types of main pilus subunits present in GAS are relatively divergent. FIGS. 198-201 provide further tables comparing the percent identity of adhesin island-encoded surface exposed proteins for different GAS serotypes relative to other GAS serotypes harbouring an adhesin island of the same or a different subtype (GAS AI-1, GAS AI-2, GAS AI-3, and GAS AI-4). See also further discussion below.

Immunizations with the Adhesin Island proteins of the invention are discussed further in the Examples.

Co-Expression of GBS Adhesin Island Proteins and Role of GBS AI Proteins in Surface Presentation

In addition to the use of the GBS adhesin island proteins for cross strain and cross serotype protection, Applicants have identified interactions between adhesin island proteins which appear to affect the delivery or presentation of the surface proteins on the surface of the bacteria.

In particular, Applicants have discovered that surface exposure of GBS 104 is dependent on the concurrent expression of GBS 80. As discussed further in Example 2, reverse transcriptase PCR analysis of AI-1 shows that all of the AI genes are co-transcribed as an operon. Applicants constructed a series of mutant GBS containing in frame deletions of various AI-1 genes. (A schematic of the GBS mutants is presented in FIG. 7). FACS analysis of the various mutants comparing mean shift values using anti-GBS 80 versus anti-GBS 104 antibodies is presented in FIG. 8. Removal of the GBS 80 operon prevented surface exposure of GBS 104; removal of the GBS 104 operon did not affect surface exposure of GBS 80. While not being limited to a specific theory, it is thought that GBS 80 is involved in the transport or localization of GBS 104 to the surface of the bacteria. The two proteins may be oligomerized or otherwise associated. It is possible that this association involves a conformational change in GBS 104 that facilitates its transition to the surface of the GBS bacteria.

Pili structures that comprise GBS 104 appear to be of a lower molecular weight than pili structures lacking GBS 104. FIG. 68 shows that polyclonal anti-GBS 104 antibodies (see lane marked α-104 POLIC.) cross-hybridize with smaller structures than do polyclonal anti-GBS 80 antibodies (see lane marked α-GBS 80 POLIC.).

In addition, Applicants have shown that removal of GBS 80 can cause attenuation, further suggesting the protein contributes to virulence. As described in more detail in Example 3, the LD₅₀'s for the Δ80 mutant and the Δ80, Δ104 double mutant were reduced by an order of magnitude compared to wildtype and Δ104 mutant.

The sortases within the adhesin island also appear to play a role in localization and presentation of the surface proteins. As discussed further in Example 4, FACS analysis of various sortase deletion mutants showed that removal of sortase SAG0648 prevented GBS 104 from reaching the surface and slightly reduced the surface exposure of GBS 80. When sortase SAG0647 and sortase SAG0648 were both knocked out, neither GBS 80 nor GBS 104 was surface exposed. Expression of either sortase alone was sufficient for GBS 80 to arrive at the bacterial surface. Expression of SAG0648, however, was required for GBS 104 surface localization.
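The deletion-mutant observations above can be summarized as a small qualitative model. This is only a sketch of the reported dependencies (presence or absence of surface exposure); it does not capture quantitative effects such as the slight reduction of GBS 80 exposure in the SAG0648 mutant, and the function names are hypothetical.

```python
def gbs80_exposed(gbs80: bool, sag0647: bool, sag0648: bool) -> bool:
    """GBS 80 reaches the surface if it is expressed and either sortase is present."""
    return gbs80 and (sag0647 or sag0648)


def gbs104_exposed(gbs104: bool, gbs80: bool, sag0648: bool) -> bool:
    """GBS 104 exposure requires GBS 104, concurrent GBS 80 expression, and SAG0648."""
    return gbs104 and gbs80 and sag0648


# Genotypes as (GBS 80, GBS 104, SAG0647, SAG0648) presence flags.
GENOTYPES = {
    "wild type":          (True, True, True, True),
    "Δ80":                (False, True, True, True),
    "Δ104":               (True, False, True, True),
    "ΔSAG0647":           (True, True, False, True),
    "ΔSAG0648":           (True, True, True, False),
    "ΔSAG0647 ΔSAG0648":  (True, True, False, False),
}

for name, (g80, g104, s647, s648) in GENOTYPES.items():
    print(f"{name}: GBS 80 exposed={gbs80_exposed(g80, s647, s648)}, "
          f"GBS 104 exposed={gbs104_exposed(g104, g80, s648)}")
```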

Accordingly, the compositions of the invention may include two or more AI proteins, wherein the AI proteins are physically or chemically associated. For example, the two AI proteins may form an oligomer. In one embodiment, the associated proteins are two AI surface proteins, such as GBS 80 and GBS 104. The associated proteins may be AI surface proteins from different adhesin islands, including host cell adhesin island proteins if the AI surface proteins are expressed in a recombinant system. For example, the associated proteins may be GBS 80 and GBS 67.

Adhesin Island Proteins from Other Gram Positive Bacteria

Applicants' identification and analysis of the GBS adhesin islands and the immunological and biological functions of these AI proteins and their pilus structures provide insight into similar structures in other Gram positive bacteria.

As discussed above, “Adhesin Island” or “AI” refers to a series of open reading frames within a bacterial genome that encode a collection of surface proteins and sortases. An Adhesin Island may encode amino acid sequences comprising at least one surface protein. Alternatively, an Adhesin Island may encode at least two surface proteins and at least one sortase. Preferably, an Adhesin Island encodes at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more AI surface proteins may participate in the formation of a pilus structure on the surface of the Gram positive bacteria.

Gram positive adhesin islands of the invention preferably include a divergently transcribed transcriptional regulator. The transcriptional regulator may regulate the expression of the AI operon.

The invention includes a composition comprising one or more Gram positive bacteria AI surface proteins. Such AI surface proteins may be associated in an oligomeric or hyperoligomeric structure.

Preferred Gram positive adhesin island proteins for use in the invention may be derived from Staphylococcus (such as S. aureus), Streptococcus (such as S. agalactiae (GBS), S. pyogenes (GAS), S. pneumoniae, S. mutans), Enterococcus (such as E. faecalis and E. faecium), Clostridium (such as C. difficile), Listeria (such as L. monocytogenes) and Corynebacterium (such as C. diphtheriae).

One or more of the Gram positive AI surface protein sequences typically include an LPXTG motif or other sortase substrate motif. Gram positive AI surface proteins of the invention may affect the ability of the Gram positive bacteria to adhere to and invade epithelial cells. AI surface proteins may also affect the ability of Gram positive bacteria to translocate through an epithelial cell layer. Preferably, one or more AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. Gram positive AI surface proteins may also be able to bind to or associate with fibrinogen, fibronectin, or collagen.

Gram positive AI sortase proteins are predicted to be involved in the secretion and anchoring of the LPXTG containing surface proteins. A Gram positive bacteria AI may encode at least one surface exposed protein. Alternatively, a Gram positive bacteria AI may encode at least two surface exposed proteins and at least one sortase. Preferably, a Gram positive AI encodes at least three surface exposed proteins and at least two sortases.

Gram positive AI surface proteins may be covalently attached to the bacterial cell wall by membrane-associated transpeptidases, such as an AI sortase. The sortase may function to cleave the surface protein, preferably between the threonine and glycine residues of an LPXTG motif. The sortase may then assist in the formation of an amide link between the threonine carboxyl group and a cell wall precursor such as lipid II. The precursor can then be incorporated into the peptidoglycan via the transglycosylation and transpeptidation reactions of bacterial wall synthesis. See Comfort et al., Infection & Immunity (2004) 72(5):2710-2722. Typically, Gram positive bacteria AI surface proteins of the invention will contain an N-terminal leader or secretion signal to facilitate translocation of the surface protein across the bacterial membrane.
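As a rough illustration of the cleavage position described above, the following Python sketch scans an amino acid sequence for LPXTG motifs (X being any residue) and reports the index at which a sortase would be expected to cut, between the threonine and the glycine. The function name `sortase_cleavage_sites` and the toy sequence are hypothetical and do not correspond to any sequence of the invention.

```python
import re

# LPXTG: leucine, proline, any residue, threonine, glycine.
LPXTG = re.compile(r"LP[A-Z]TG")


def sortase_cleavage_sites(protein):
    """Return (motif_start, motif, cleavage_index) for each LPXTG match.

    cleavage_index is the 0-based index of the glycine, so that
    protein[:cleavage_index] ends with the threonine of the motif.
    """
    sites = []
    for match in LPXTG.finditer(protein):
        sites.append((match.start(), match.group(), match.start() + 4))
    return sites


# Hypothetical toy sequence used only to exercise the function.
toy = "MKKAAAELPKTGDDSNVLLAGLLRRK"
for start, motif, cut in sortase_cleavage_sites(toy):
    print(motif, "at", start, "; N-terminal fragment:", toy[:cut])
```

A plain pattern search of this kind only flags candidate motifs; it says nothing about whether a given sortase actually recognizes the site.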

Gram positive bacteria AI surface proteins of the invention may affect the ability of the Gram positive bacteria to adhere to and invade target host cells, such as epithelial cells. Gram positive bacteria AI surface proteins may also affect the ability of the Gram positive bacteria to translocate through an epithelial cell layer. Preferably, one or more of the Gram positive AI surface proteins are capable of binding to or otherwise associating with an epithelial cell surface. Further, one or more Gram positive AI surface proteins may bind to fibrinogen, fibronectin, or collagen protein.

In one embodiment, the invention includes a composition comprising oligomeric, pilus-like structures comprising a Gram positive bacteria AI surface protein. The oligomeric, pilus-like structure may comprise numerous units of the AI surface protein. Preferably, the oligomeric, pilus-like structures comprise two or more AI surface proteins. Still more preferably, the oligomeric, pilus-like structure comprises a hyper-oligomeric pilus-like structure comprising at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 150, 200 or more) oligomeric subunits, wherein each subunit comprises an AI surface protein or a fragment thereof. The oligomeric subunits may be covalently associated via a conserved lysine within a pilin motif. The oligomeric subunits may be covalently associated via an LPXTG motif, preferably, via the threonine amino acid residue.

Gram positive bacteria AI surface proteins or fragments thereof to be incorporated into the oligomeric, pilus-like structures of the invention will preferably include one or both of a pilin motif comprising a conserved lysine residue and an E box motif comprising a conserved glutamic acid residue.

The oligomeric, pilus-like structures may be used alone or in the combinations of the invention. In one embodiment, the invention comprises a Gram positive bacteria Adhesin Island in oligomeric form, preferably in a hyperoligomeric form.

The oligomeric, pilus-like structures of the invention may be combined with one or more additional Gram positive AI proteins (from the same or a different Gram positive species or genus). In one embodiment, the oligomeric, pilus-like structures comprise one or more Gram positive bacteria AI surface proteins in combination with a second Gram positive bacteria protein. The second Gram positive bacteria protein may be a known antigen, and need not normally be associated with an AI protein.

The oligomeric, pilus-like structures may be isolated or purified from bacterial cultures overexpressing a Gram positive bacteria AI surface protein. The invention therefore includes a method for manufacturing an oligomeric Adhesin Island surface antigen comprising culturing a Gram positive bacteria adapted for increased AI protein expression and isolating the expressed oligomeric Adhesin Island protein from the Gram positive bacteria. The AI protein may be collected from secretions into the supernatant or it may be purified from the bacterial surface. The method may further comprise purification of the expressed Adhesin Island protein. Preferably, the Adhesin Island protein is in a hyperoligomeric form.

Gram positive bacteria are preferably adapted to increase AI protein expression by at least two (e.g., 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150 or 200) times wild type expression levels.

Gram positive bacteria may be adapted to increase AI protein expression by means known in the art, including methods of increasing gene dosage and methods of gene upregulation. Such means include, for example, transformation of the Gram positive bacteria with a plasmid encoding the AI protein. The plasmid may include a strong promoter or it may include multiple copies of the sequence encoding the AI protein. Optionally, the sequence encoding the AI protein within the Gram positive bacterial genome may be deleted. Alternatively, or in addition, the promoter regulating the Gram positive Adhesin Island may be modified to increase expression.

The invention further includes Gram positive bacteria which have been adapted to produce increased levels of AI surface protein. In particular, the invention includes Gram positive bacteria which have been adapted to produce oligomeric or hyperoligomeric AI surface protein. In one embodiment, the Gram positive bacteria of the invention are inactivated or attenuated to permit in vivo delivery of the whole bacteria, with the AI surface protein exposed on its surface.

The invention further includes Gram positive bacteria which have been adapted to have increased levels of expressed AI protein incorporated in pili on their surface. The Gram positive bacteria may be adapted to have increased exposure of oligomeric or hyperoligomeric AI proteins on their surface by increasing expression levels of a signal peptidase polypeptide. Increased levels of a local signal peptidase expression in Gram positive bacteria (such as LepA in GAS) are expected to result in increased exposure of pili proteins on the surface of Gram positive bacteria. Increased expression of a leader peptidase in Gram positive bacteria may be achieved by any means known in the art, such as increasing gene dosage and methods of gene upregulation. The Gram positive bacteria adapted to have increased levels of leader peptidase may additionally be adapted to express increased levels of at least one pili protein.

Alternatively, the AI proteins of the invention may be expressed on the surface of a non-pathogenic Gram positive bacteria, such as Streptococcus gordonii (See, e.g., Byrd et al., “Biological consequences of antigen and cytokine co-expression by recombinant Streptococcus gordonii vaccine vectors”, Vaccine (2002) 20:2197-2205) or Lactococcus lactis (See, e.g., Mannam et al., “Mucosal Vaccine Made from Live, Recombinant Lactococcus lactis Protects Mice against Pharyngeal Infection with Streptococcus pyogenes” Infection and Immunity (2004) 72(6):3444-3450). It has already been demonstrated, above, that L. lactis expresses GBS and GAS AI polypeptides in oligomeric form and on its surface.

Alternatively, the oligomeric, pilus-like structures may be produced recombinantly. If produced in a recombinant host cell system, the Gram positive bacteria AI surface protein will preferably be expressed in coordination with the expression of one or more of the AI sortases of the invention. Such AI sortases will facilitate oligomeric or hyperoligomeric formation of the AI surface protein subunits.

Gram positive AI Sortases of the invention will typically have a signal peptide sequence within the first 70 amino acid residues. They may also include a transmembrane sequence within 50 amino acid residues of the C terminus. The sortases may also include at least one basic amino acid residue within the last 8 amino acids. Preferably, the sortases have one or more active site residues, such as a catalytic cysteine and histidine.

Adhesin island surface proteins from two or more Gram positive bacterial genera or species may be combined to provide an immunogenic composition for prophylactic or therapeutic treatment of disease or infection by two or more Gram positive bacterial genera or species. Optionally, the adhesin island surface proteins may be associated together in an oligomeric or hyperoligomeric structure.

In one embodiment, the invention comprises adhesin island surface proteins from two or more Streptococcus species. For example, the invention includes a composition comprising a GBS AI surface protein and a GAS adhesin island surface protein. As another example, the invention includes a composition comprising a GAS adhesin island surface protein and a S. pneumoniae adhesin island surface protein.

In one embodiment, the invention comprises adhesin island surface proteins from two or more Gram positive bacterial genera. For example, the invention includes a composition comprising a Streptococcus adhesin island protein and a Corynebacterium adhesin island protein.

Examples of AI sequences in several Gram positive bacteria are discussed further below.

Streptococcus pyogenes (GAS)

As discussed above, Applicants have identified at least four different GAS Adhesin Islands. These adhesin islands are thought to encode surface proteins which are important in the bacteria's virulence, and Applicants have obtained the first electron micrographs revealing the presence of these adhesin island proteins in hyperoligomeric pilus structures on the surface of Group A Streptococcus.

Group A Streptococcus is a human specific pathogen which causes a wide variety of diseases ranging from pharyngitis and impetigo through life threatening invasive disease and necrotizing fasciitis. In addition, post-streptococcal autoimmune responses are still a major cause of cardiac pathology in children.

Group A Streptococcal infection of its human host can generally occur in three phases. The first phase involves attachment and/or invasion of the bacteria into host tissue and multiplication of the bacteria within the extracellular spaces. Generally this attachment phase begins in the throat or the skin. The deeper the tissue level infected, the more severe the damage that can be caused. In the second stage of infection, the bacteria secrete a soluble toxin that diffuses into the surrounding tissue or even systemically through the vasculature. This toxin binds to susceptible host cell receptors and triggers inappropriate immune responses by these host cells, resulting in pathology. Because the toxin can diffuse throughout the host, the necrosis directly caused by the GAS toxins may be physically located in sites distant from the bacterial infection. The final phase of GAS infection can occur long after the original bacteria have been cleared from the host system. At this stage, the host's previous immune response to the GAS bacteria can cause pathology due to cross reactivity between epitopes of a GAS surface protein, M, and host tissues, such as the heart. A general review of GAS infection can be found in Principles of Bacterial Pathogenesis, Groisman ed., Chapter 15 (2001).

In order to prevent the pathogenic effects associated with the later stages of GAS infection, an effective vaccine against GAS will preferably facilitate host elimination of the bacteria during the initial attachment and invasion stage.

Isolates of Group A Streptococcus are historically classified according to the M surface protein described above. The M protein is a surface exposed, trypsin-sensitive protein generally comprising two polypeptide chains complexed in an alpha helical formation. The carboxyl terminus is anchored in the cytoplasmic membrane and is highly conserved among all group A streptococci. The amino terminus, which extends through the cell wall to the cell surface, is responsible for the antigenic variability observed among the 80 or more serotypes of M proteins.

A second layer of classification is based on a variable, trypsin-resistant surface antigen, commonly referred to as the T-antigen. Decades of epidemiology based on M and T serological typing have been central to studies on the biological diversity and disease causing potential of Group A Streptococci. While the M-protein component and its inherent variability have been extensively characterized, even after five decades of study, there is still very little known about the structure and variability of T-antigens. Antisera to define T types are commercially available from several sources, including Sevapharma (http://www.sevapharma.cz/en).

The gene coding for one form of T-antigen, T-type 6, from an M6 strain of GAS (D741) has been cloned and characterized and maps to an approximately 11 kb highly variable pathogenicity island. Schneewind et al., J. Bacteriol. (1990) 172(6):3310-3317. This island is known as the Fibronectin-binding, Collagen-binding T-antigen (FCT) region because it contains, in addition to the T6 coding gene (tee6), members of a family of genes coding for Extra Cellular Matrix (ECM) binding proteins. Bessen et al., Infection & Immunity (2002) 70(3):1159-1167. Several of the protein products of this gene family have been shown to directly bind fibronectin and/or collagen. See Hanski et al., Infection & Immunity (1992) 60(12):5119-5125; Talay et al., Infection & Immunity (1992) 60(9):3837-3844; Jaffe et al. (1996) 21(2):373-384; Rocha et al., Adv Exp Med Biol. (1997) 418:737-739; Kreikemeyer et al., J Biol Chem (2004) 279(16):15850-15859; Podbielski et al., Mol. Microbiol. (1999) 31(4):1051-64; and Kreikemeyer et al., Int. J. Med Microbiol (2004) 294(2-3):177-88. In some cases direct evidence for a role of these proteins in adhesion and invasion has been obtained.

Applicants raised antiserum against a recombinant product of the tee6 gene and used it to explore the expression of T6 in M6 strain ISS3650. In immunoblots of mutanolysin extracts of this strain, the antiserum recognized, in addition to a band corresponding to the predicted molecular mass of the tee6 gene product, very high molecular weight ladders ranging in mobility from about 100 kDa to beyond the resolution of the 3-8% gradient gels used. See FIG. 163A, last lane labeled “M6 Tee6.”

This pattern of high molecular weight products is similar to that observed in immunoblots of the protein components of the pili identified in Streptococcus agalactiae (described above) and previously in Corynebacterium diphtheriae. Electron microscopy of strain M6 ISS3650 with antisera specific for the product of tee6 revealed abundant surface staining and long pilus-like structures extending up to 700 nanometers from the bacterial surface, revealing that the T6 protein, one of the antigens recognized in the original Lancefield serotyping system, is located within a GAS Adhesin Island (GAS AI-1) and forms long covalently linked pilus structures. See FIG. 163I.

In addition to the tee6 gene, the FCT region in M6_ISS3650 (GAS AI-1) contains two other genes (prtF1 and cpa) predicted to code for surface exposed proteins; these proteins are characterized as containing the cell wall attachment motif LPXTG. Western blot analysis using antiserum specific for PrtF1 detected a single molecular species with electrophoretic mobility corresponding to the predicted molecular mass of the protein and one smaller band of unknown origin. Western blot analysis using antisera specific for Cpa recognized a high molecular weight covalently linked ladder (FIG. 163A, second lane). Immunogold labelling of Cpa with specific antiserum followed by transmission electron microscopy detected an abundance of Cpa at the cell surface and only occasional structures extending from the cell surface (FIG. 163J).

Four classes of FCT region can be discerned by the types and order of the genes contained within the region. The FCT regions of strains of types M3, M5, M18 and M49 have a similar organization whereas those of M6, M1 and M12 differ. See FIG. 164. As discussed below, these four FCT regions correlate to four GAS Adhesin Island types (AI-1, AI-2, AI-3 and AI-4).

Applicants' discovery of genes coding for pili in the FCT region of strain M6_ISS3650 prompted them to examine the predicted surface exposed proteins in the variant FCT regions of three other GAS strains having different M-types (M1_SF370, M5_ISS4883 and M12_20010296) representing the other three FCT variants. Each gene present in the FCT region of each bacteria was cloned and expressed. Antisera specific for each recombinant protein were then used to probe mutanolysin extracts of the respective strains (6). In M1 strain SF370, there are three predicted surface proteins (Cpa (also referred to as M1_126 and GAS 15), M1_128 (a fimbrial protein also referred to as Spy0128 and GAS 16), and M1_130 (also referred to as Spy0130 and GAS 18)) (GAS AI-2). Antisera specific for each surface protein reacted with a ladder of high molecular weight material (FIG. 163B). Immunogold staining of M1 strain SF370 with antiserum specific for M1_128 revealed pili structures similar to those seen when M6 strain ISS3650 was immunogold stained with antiserum specific for tee6 (See FIG. 163K). Antisera specific for surface proteins Cpa and M1_130 revealed abundant surface staining and occasional structures extending from the surface of M1 strain SF370 bacteria (FIG. 163S).

The M1_128 protein appears to be necessary for polymerization of Cpa and M1_130 proteins. If the M1_128 gene in M1_SF370 was deleted, Western blot analysis using antibodies that hybridize to Cpa and M1_130 no longer detected high molecular weight ladders comprising the Cpa and M1_130 proteins (FIG. 163 E). See also FIGS. 177 A-C which provide the results of Western blot analysis of the M1_128 (Δ128) deleted bacteria using anti-M1_130 antiserum (FIG. 177 A), anti-M1_128 antiserum (FIG. 177 B), and anti-M1_126 antiserum (FIG. 177 C). High molecular weight ladders, indicative of pilus formation on the surface of M1 strain SF370, could not be detected by any of the three antisera in Δ128 bacteria. If the Δ128 bacteria were transformed with a plasmid containing the gene for M1_128, Western blot analysis using antisera specific for Cpa and M1_130 again detected high molecular weight ladders (FIG. 163 H).

In agreement with the Western blot analysis, immunoelectron microscopy failed to detect pilus assembly on the Δ128 strain SF370 bacteria using M1_128 antisera (FIG. 178 B). Although Δ128 SF370 bacteria were unable to form pili, M1_126 (cpa) and M1_130, which contain sortase substrate motifs, were present on the bacteria's surface. FACS analysis of the M1_128 deleted (Δ128) strain SF370 bacteria also detected both M1_126 and M1_130 on the surface of the Δ128 strain SF370 bacteria. See FIG. 179 D and F, which show a shift in fluorescence when antibodies immunoreactive to M1_126 and M1_130 are used on Δ128 bacteria. As expected, virtually no shift in fluorescence is observed when antibodies immunoreactive to M1_128 are used with the Δ128 bacteria (FIG. 179 E).

By contrast, deletion of the M1_130 gene did not affect polymerization of M1_128 (FIG. 163 F). See also FIGS. 177 A-C, which provide Western blot analysis results of the M1_130 deleted (Δ130) strain SF370 bacteria using anti-M1_130 (FIG. 177 A), anti-M1_128 (FIG. 177 B), and anti-M1_126 antiserum (FIG. 177 C). The anti-M1_128 and anti-M1_126 antisera both detected the presence of high molecular weight ladders in the Δ130 strain SF370 bacteria, indicating that the Δ130 bacteria form pili that comprise M1_126 and M1_128 polypeptides in the absence of M1_130. As expected, the Western blot probed with antiserum immunoreactive with M1_130 did not detect any proteins for the Δ130 bacteria (FIG. 177A).

Hence, the composition of the pili in GAS resembles that previously described for both C. diphtheriae (7, 8) and S. agalactiae (described above) (9) in that each pilus is formed by a backbone component which abundantly stains the pili in EM and is essential for the incorporation of the other components.

Also similar to C. diphtheriae, elimination of the srtC1 gene from the FCT region of M1_SF370 abolished polymerization of all three proteins and assembly of pili (FIG. 163 G). See also FIGS. 177 A-C, which provide Western blot analysis of the SrtC1 deleted (ΔSrtC1) strain SF370 bacteria using anti-M1_130 (FIG. 177 A), anti-M1_128 (FIG. 177 B), and anti-M1_126 antiserum (FIG. 177 C). None of the three antisera immunoreacted with high molecular weight structures (pili) in the ΔSrtC1 bacteria. Confirming that deletion of the SrtC1 gene abrogates pilus assembly in strain SF370, immunoelectron microscopy using antisera against M1_128 failed to detect pilus formation on the bacteria surface. See FIG. 178 C. Although no assembled pili were detected on ΔSrtC1 SF370, M1_128 proteins could be detected on the surface of SF370. Thus, it appeared that SrtC1 deletion prevented pilus assembly on the surface of the SF370 bacteria, but not anchoring of the proteins that comprise pili to the bacterial cell wall. FACS analysis of the ΔSrtC1 strain SF370 confirmed that deletion of SrtC1 does not eliminate cell surface expression of M1_126, M1_128 or M1_130. See FIG. 179 G-I, which show a shift in fluorescence when antibodies immunoreactive to M1_126 (FIG. 179 G), M1_128 (FIG. 179 H), and M1_130 (FIG. 179 I) are used to detect cell surface protein expression on ΔSrtC1 bacteria. Thus, SrtC1 deletion prevents pilus formation, but not surface anchoring of proteins involved in pilus formation on the surface of bacteria. Another sortase is possibly involved in anchoring of the proteins to the bacteria surface. Pilus polymerization in C. diphtheriae is also dependent on a particular sortase enzyme whose gene resides at the same genetic locus as the pilus components (7, 8).

The LepA signal peptidase, Spy0127, also appears to be essential for pilus assembly in strain SF370. LepA deletion mutants (ΔLepA) of strain SF370 fail to assemble pili on the cell surface. Not only are the ΔLepA mutants unable to assemble pili, they are also deficient at cell surface M1 expression. See FIG. 180, which provides a FACS analysis of the wildtype (A) and ΔLepA mutant (B) SF370 bacteria using M1 antisera. No shift in fluorescence is observed for the ΔLepA mutant bacteria in the presence of M1 immune serum. It is possible that these deletion mutants of LepA will be useful for detecting non-M, non-pili, surface exposed antigens on the surface of GAS, or any Gram positive bacteria. These antigens may also be useful in immunogenic compositions.

Pili were also observed in M5 strain ISS4882 and M12 strain 20010296. The M5 strain ISS4882 contains genes for four predicted surface exposed proteins (GAS AI-3). Antisera against three of the four products of the FCT region (GAS AI-3) of M5_ISS4883 (Cpa, M5_orf80, M5_orf82) stained high molecular weight ladders in Western blot analysis (FIG. 163 C). Long pili were visible when antisera against M5_orf80 were used in immunogold staining followed by electron microscopy (FIG. 163L).

The M12 strain 20010296 contains genes for five predicted surface exposed proteins (GAS AI-4). Antisera against three of the five products of the FCT region (GAS AI-4) of M12_20010296 (Cpa, EftLSL.A, Orf2) stained high molecular weight ladders in Western blot analysis (FIG. 163 D). Long pili were visible when antisera against EftLSL.A were used (FIG. 163M).

The major pilus forming proteins identified in the four strains studied by Applicants (T6, M1_128, M5_orf80 and EftLSL.A) share between 23% and 65% amino acid identity in any pairwise comparison, indicating that each pilus may represent a different Lancefield T-antigen. Each pilus is part of a trypsin resistant structure on the GAS bacteria surface, as is the case for the Lancefield T antigens. See FIG. 165, which provides a FACS analysis of bacteria harboring each of the FCT types that had or had not been treated with trypsin (6). Following treatment, surface expression of the pilus proteins was assayed by indirect immunofluorescence and flow cytometry using antibodies specific for the pilus proteins, the bacteria's respective M proteins, or surface proteins not associated with the pili (FIG. 165). Staining the cells with sera specific for proteins associated with the pili was not affected by trypsin treatment, whereas trypsin treatment substantially reduced detection of M-proteins or surface proteins not associated with pili.
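For readers unfamiliar with how a pairwise identity figure of this kind is obtained, the following Python sketch computes percent identity between two sequences that have already been aligned. It is a minimal illustration under stated assumptions: identity is counted only over columns where neither sequence has a gap (other conventions exist and give different numbers), the function name `percent_identity` is hypothetical, and the toy alignment is invented and is not derived from the T6, M1_128, M5_orf80 or EftLSL.A sequences.

```python
from itertools import combinations


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


# Invented, pre-aligned toy sequences ("-" marks a gap).
toy_alignment = {
    "seq_a": "MKT-LLA",
    "seq_b": "MRTALLG",
    "seq_c": "MKTALLA",
}

# Report every pairwise comparison, as in an all-against-all identity table.
for (name_1, s1), (name_2, s2) in combinations(toy_alignment.items(), 2):
    print(name_1, name_2, round(percent_identity(s1, s2), 1))
```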

The pili structures identified on the surface of the GAS bacteria were confirmed to be Lancefield T antigens when commercially available T-serotyping sera detected the pili on the surface of bacteria. Western blot analysis was initially performed to determine if polyvalent serum pools (designated T, U, W, X, and Y) could detect recombinant proteins for each of the major pilus components (T6, M1_128, M5_orf80 and EftLSL.A) identified in the strains of bacteria discussed above. Pool U, which contains the T6 serum, recognized the T6 protein specifically (a surface exposed pilus protein from GAS AI-1) (FIG. 166 B). Pool T specifically recognized M1_128 (a surface exposed pilus protein from GAS AI-2) (FIG. 166 A). Pool W recognized both M5_orf80 and EftLSL.A (FIG. 166 C). Using monovalent sera representative of each of the components of each polyvalent pool, Applicants confirmed the specificity of the T6 antigen (corresponding to a surface exposed pilus protein from GAS AI-1) (FIG. 166 E) and identified M1_128 as antigen T1 (corresponding to a surface exposed pilus protein from GAS AI-2) (FIG. 166 D), EftLSL.A as antigen T12 (corresponding to a surface exposed pilus protein from GAS AI-4) (FIG. 166 G) and M5_orf80 as a common antigen recognized by the related sera T5, T27 and T44 (corresponding to a surface exposed pilus protein from GAS AI-3).
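The serotyping results above amount to a lookup between pilus backbone proteins, serum pools, T antigens and adhesin island types. The following Python sketch records those associations as reported in this paragraph; the dictionary layout and the helper name `island_for_t_antigen` are illustrative assumptions and are not part of any serotyping kit or software.

```python
# Associations as reported in the text; the data layout is an assumption.
BACKBONE_PILINS = {
    "T6": {"island": "GAS AI-1", "pool": "U", "t_antigens": ("T6",)},
    "M1_128": {"island": "GAS AI-2", "pool": "T", "t_antigens": ("T1",)},
    "M5_orf80": {"island": "GAS AI-3", "pool": "W",
                 "t_antigens": ("T5", "T27", "T44")},
    "EftLSL.A": {"island": "GAS AI-4", "pool": "W", "t_antigens": ("T12",)},
}


def island_for_t_antigen(t_antigen):
    """Return the GAS AI type whose pilus protein matched this T antigen, or None."""
    for record in BACKBONE_PILINS.values():
        if t_antigen in record["t_antigens"]:
            return record["island"]
    return None


print(island_for_t_antigen("T27"))
```

T types not discussed in this paragraph return `None` here, which reflects only the limits of the table and not a statement about those serotypes.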

Confirming Applicants' observations, discussed above, that deleting the M1_128 gene from M1_SF370 abolishes pilus formation, the pool T sera stained whole M1_SF370 bacteria (FIG. 166 H) but failed to stain M1_SF370 bacteria lacking the M1_128 gene (FIG. 166 I).

As discussed above, Applicants have identified at least four different Group A Streptococcus Adhesin Islands. While these GAS AI sequences can be identified in numerous M types, Applicants have surprisingly discovered a correlation between the four main pilus subunits from the four different GAS AI types and specific T classifications. While other trypsin-resistant surface exposed proteins are likely also implicated in the T classification designations, the discovery of the role of the GAS adhesin islands (and the associated hyper-oligomeric pilus-like structures) in T classification and GAS serotype variance has important implications for prevention and treatment of GAS infections. Applicants have identified protein components within each of the GAS adhesin islands which are associated with the pilus formation. These proteins are believed to be involved in the bacteria's initial adherence mechanisms. Immunological recognition of these proteins may allow the host immune response to slow or prevent the bacteria's transition into the more pathogenic later stages of infection. In addition, the GAS pili may be involved in formation of biofilms. Applicants have discovered that the GBS pili structures appear to be implicated in the formation of biofilms (populations of bacteria growing on a surface, often enclosed in an exopolysaccharide matrix). Biofilms are generally associated with bacterial resistance, as antibiotic treatments and host immune response are frequently unable to eradicate all of the bacterial components of the biofilm. Direction of a host immune response against surface proteins exposed during the first steps of bacterial attachment (i.e., before complete biofilm formation) is preferable.

The invention therefore provides for improved immunogenic compositions against GAS infection which may target GAS bacteria during their initial attachment efforts to the host epithelial cells and may provide protection against a wide range of GAS serotypes. The immunogenic compositions of the invention include GAS AI surface proteins which may be formulated in an oligomeric, or hyperoligomeric (pilus) form. The invention also includes combinations of GAS AI surface proteins. Combinations of GAS AI surface proteins may be selected from the same adhesin island or they may be selected from different GAS adhesin islands.

The invention comprises compositions comprising a first GAS AI protein and a second GAS AI protein wherein the first and second GAS AI proteins are derived from different GAS adhesin islands. For example, the invention includes a composition comprising at least two GAS AI proteins wherein the GAS AI proteins are encoded by the adhesin islands selected from the group consisting of GAS AI-1 and GAS AI-2; GAS AI-1 and GAS AI-3; GAS AI-1 and GAS AI-4; GAS AI-2 and GAS AI-3; GAS AI-2 and GAS AI-4; and GAS AI-3 and GAS AI-4. Preferably the two GAS AI proteins are derived from different T-types.
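The six island pairings listed above are simply all unordered pairs drawn from the four GAS AI types. As a short illustrative sketch (the variable names are hypothetical), the enumeration can be reproduced in Python with the standard library:

```python
from itertools import combinations

GAS_AI_TYPES = ["GAS AI-1", "GAS AI-2", "GAS AI-3", "GAS AI-4"]

# All unordered pairs of distinct adhesin island types.
island_pairs = list(combinations(GAS_AI_TYPES, 2))

for first, second in island_pairs:
    print(first, "and", second)
```

The same call with a larger group size would enumerate compositions drawing on three or more islands, should such combinations be of interest.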

A schematic arrangement of GAS Adhesin Island sequences is set forth in FIG. 162. In all strains, the AI region is flanked by the highly conserved open reading frames M1_123 and M1-136. Between three and five genes in each locus code for surface proteins containing LPXTG motifs. These surface proteins also all belong to the family of genes coding for ECM binding adhesins.

Adhesin island sequences can be identified in numerous M types of Group A Streptococcus. Examples of AI sequences within M1, M6, M3, M5, M12, M18, and M49 serotypes are discussed below.

GAS Adhesin Islands generally include a series of open reading frames within a GAS genome that encode for a collection of surface proteins and sortases. A GAS Adhesin Island may encode for amino acid sequences comprising at least one surface protein. Alternatively, a GAS Adhesin Island may encode for at least two surface proteins and at least one sortase. Preferably, a GAS Adhesin Island encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more GAS AI surface proteins may participate in the formation of a pilus structure on the surface of the Gram positive bacteria.

GAS Adhesin Islands of the invention preferably include a divergently transcribed transcriptional regulator. The transcriptional regulator may regulate the expression of the GAS AI operon. Examples of transcriptional regulators found in GAS AI sequences include RofA and Nra.

The GAS AI surface proteins may bind or otherwise adhere to fibrinogen, fibronectin, or collagen. One or more of the GAS AI surface proteins may comprise a fimbrial structural subunit.

One or more of the GAS AI surface proteins may include an LPXTG motif or other sortase substrate motif. The LPXTG motif may be followed by a hydrophobic region and a charged C terminus, which are thought to retard the protein in the cell membrane to facilitate recognition by the membrane-localized sortase. See Barnett, et al., J. Bacteriology (2004) 186 (17): 5865-5875.
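To make the motif notation concrete, the following is a minimal sketch of how an LPXTG-type motif could be located in an amino acid sequence, reading "X" as any single residue. The function name, the toy sequence, and the choice of a regular expression are assumptions made here for illustration; they do not describe how the sequences of the invention were identified.

```python
import re

def find_motif(sequence, motif="LPXTG"):
    """Return (1-based position, matched residues) for each motif occurrence.

    'X' in the motif is treated as any single residue; every other
    letter must match exactly.
    """
    pattern = "".join("[A-Z]" if ch == "X" else re.escape(ch) for ch in motif)
    # A lookahead is used so that overlapping occurrences are also reported.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]

# Hypothetical toy sequence, invented purely for illustration.
toy_sequence = "MKTAYLPETGEEKKSNAALLGLVAGLLLRRKRS"

print(find_motif(toy_sequence))           # LPXTG-type hits
print(find_motif(toy_sequence, "QVXTG"))  # a variant motif; none in the toy sequence
```

The same function accepts the variant motifs named elsewhere in this description (for example LPXSG or QVXTG) by passing a different `motif` argument.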

GAS AI sequences may be generally categorized as Type 1, Type 2, Type 3, or Type 4, depending on the number and type of sortase sequences within the island and the percentage identity of other proteins (with the exception of RofA and cpa) within the island. FIG. 167 provides a chart indicating the number and type of sortase sequences identified within the adhesin islands of various strains and serotypes of GAS. As can be seen in this figure, all GAS strains and serotypes thus far characterized as an AI-1 have a SrtB type sortase, all GAS strains and serotypes thus far characterized as an AI-2 have SrtB and SrtC1 type sortases, all GAS strains and serotypes thus far characterized as an AI-3 have a SrtC2 type sortase, and all GAS strains and serotypes thus far characterized as an AI-4 have SrtB and SrtC2 type sortases. A comparison of the percentage identity of sequences within the adhesin islands was presented in Table 45, see above.
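The sortase-based part of this categorization can be expressed as a simple lookup. The sketch below is illustrative only: the names are hypothetical, and it uses sortase content alone, whereas the categorization described above also takes into account the percentage identity of other proteins within the island.

```python
# Sortase content of each adhesin island type, as described for FIG. 167.
SORTASES_BY_AI_TYPE = {
    frozenset({"SrtB"}): "AI-1",
    frozenset({"SrtB", "SrtC1"}): "AI-2",
    frozenset({"SrtC2"}): "AI-3",
    frozenset({"SrtB", "SrtC2"}): "AI-4",
}

def provisional_ai_type(sortases):
    """Return the AI type suggested by sortase content alone, or None.

    This is a provisional call only; sequence identity of the other
    island proteins is not considered here.
    """
    return SORTASES_BY_AI_TYPE.get(frozenset(sortases))

print(provisional_ai_type(["SrtB", "SrtC1"]))
print(provisional_ai_type(["SrtC2"]))
```

Using `frozenset` keys makes the lookup independent of the order in which the sortases are listed, and an unrecognized combination returns `None` rather than a guess.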

(1) Adhesin Island Sequence within M6: GAS Adhesin Island 1 (“GAS AI-1”)

A GAS Adhesin Island within M6 serotype (MGAS10394) is outlined in Table 4 below. This GAS adhesin island 1 (“GAS AI-1”) comprises surface proteins, a srtB sortase and a rofA divergently transcribed transcriptional regulator.

GAS AI-1 surface proteins include Spy0157 (a fibronectin binding protein), Spy0159 (a collagen adhesion protein) and Spy0160 (a fimbrial structural subunit). Preferably, each of these GAS AI-1 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122) or LPXSG (SEQ ID NO: 134) (conservative replacement of threonine with serine).

GAS AI-1 includes a srtB type sortase. GAS srtB sortases may preferably anchor surface proteins with an LPSTG motif (SEQ ID NO: 166), particularly where the motif is followed by a serine.

TABLE 4 GAS AI-1 sequences from M6 isolate (MGAS10394)

AI-1 sequence identifier | Sortase substrate sequence or sortase type | Functional description
M6_Spy0156 | | Transcriptional regulator (rofA)
M6_Spy0157 | LPXTG | Fibronectin-binding protein
M6_Spy0158 | | Reverse transcriptase
M6_Spy0159 | LPXSG | Collagen adhesion protein
M6_Spy0160 | LPXTG | Fimbrial structural subunit
M6_Spy0161 | srtB | Sortase

M6_Spy0160 appears to be present on the surface of GAS as part of oligomeric (pilus) structures. FIGS. 127-132 present electron micrographs of GAS serotype M6 strain 3650 immunogold stained for M6_Spy0160 using anti-M6_Spy0160 antiserum. Oligomeric or hyperoligomeric structures labelled with gold particles can be seen extending from the surface of the GAS in each of these figures, indicating the presence of multiple M6_Spy0160 polypeptides in the oligomeric or hyperoligomeric structures. FIGS. 176 A-F present electron micrographs of GAS M6 strain 2724 immunogold stained for M6_Spy0160 using anti-M6_Spy0160 antiserum (FIGS. 176 A-E) or immunogold stained for M6_Spy0159 using anti-M6_Spy0159 antiserum (FIG. 176 F). Oligomeric or hyperoligomeric structures labelled with gold particles can again be seen extending from the surface of the M6 strain 2724 GAS bacteria immunogold stained for M6_Spy0160. M6_Spy0159 is also detected on the surface of the M6 strain 2724 GAS.

FACS analysis has confirmed that the GAS AI-1 surface proteins spyM6_0159 and spyM6_0160 are indeed expressed on the surface of GAS. FIG. 73 provides the results of FACS analysis for surface expression of spyM6_0159 on each of GAS serotypes M6 2724, M6 3650, and M6 2894. A shift in fluorescence is observed for each GAS serotype when anti-spyM6_0159 antiserum is present, demonstrating cell surface expression. Table 18, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-spyM6_0159 antiserum, and the difference in fluorescence value between the pre-immune and anti-spyM6_0159 antiserum.

TABLE 18 Summary of FACS values for surface expression of spyM6_0159

Strain | Pre-immune | Anti-spyM6_0159 | Change
2724 | 134.84 | 427.48 | 293
3650 | 149.68 | 712.62 | 563
2894 | 193.86 | 597.8 | 404
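The "Change" values in Table 18 appear consistent with the anti-spyM6_0159 value minus the pre-immune value, rounded to the nearest whole number. The short sketch below, with hypothetical variable and function names, recomputes the column from the tabulated values under that assumption:

```python
# Values copied from Table 18: (pre-immune, anti-spyM6_0159) for each strain.
facs_table_18 = {
    "2724": (134.84, 427.48),
    "3650": (149.68, 712.62),
    "2894": (193.86, 597.80),
}

def fluorescence_change(pre_immune, immune):
    """Immune minus pre-immune fluorescence, rounded to a whole number."""
    return round(immune - pre_immune)

changes = {
    strain: fluorescence_change(pre, anti)
    for strain, (pre, anti) in facs_table_18.items()
}
print(changes)
```

Under this assumption the recomputed values are 293, 563, and 404, which agree with the tabulated "Change" entries. The same calculation applies to the other FACS summary tables in this section.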

FIG. 74 provides the results of FACS analysis for surface expression of spyM6_0160 on each of GAS serotypes M6 2724, M6 3650, and M6 2894. In the presence of anti-spyM6_0160 antiserum, a shift in fluorescence is observed for each GAS serotype, which demonstrates its cell surface expression. Table 19, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-spyM6_0160 antiserum, and the change in fluorescence value between the pre-immune and anti-spyM6_0160 antiserum.

TABLE 19 Summary of FACS values for surface expression of spyM6_0160

Strain | Pre-immune | Anti-spyM6_0160 | Change
2724 | 117.12 | 443.24 | 326
3650 | 128.57 | 776.39 | 648
2894 | 125.87 | 621.17 | 495

Surface expression of M6_Spy0159 and M6_Spy0160 on M6 serotype GAS has also been confirmed by Western blot analysis. FIG. 98 shows that while pre-immune sera (P α-0159) does not detect expression of M6_Spy0159 in GAS serotype M6, anti-M6_Spy0159 immune sera (I α-0159) is able to detect M6_Spy0159 protein in both total GAS M6 extracts (M6 tot) and GAS M6 fractions enriched for cell surface proteins (M6 surf prot). The M6_Spy0159 proteins detected in the total GAS M6 extracts or the GAS M6 extracts enriched for surface proteins are also present as high molecular weight structures, indicating that M6_Spy0159 may be in an oligomeric (pilus) form.

FIG. 112 shows that while preimmune sera (Preimmune Anti 106) does not detect expression of M6_Spy0160 in GAS serotype M6 strain 2724, anti-M6_Spy0160 immune sera (Anti 160) does in both total GAS M6 strain 2724 extracts (M6 2724 tot) and GAS M6 strain 2724 fractions enriched for surface proteins. The M6_Spy0160 proteins detected in the total GAS M6 strain 2724 extracts or the GAS M6 strain 2724 extracts enriched for surface proteins are also present as high molecular weight structures, indicating that M6_Spy0160 may be in an oligomeric (pilus) form.

FIGS. 110 and 111 both further verify the presence of M6_Spy0159 and M6_Spy0160 in higher molecular weight structures on the surface of GAS. FIG. 110 provides a Western blot performed to detect M6_Spy0159 and M6_Spy0160 in GAS M6 strain 2724 extracts enriched for surface proteins. Antiserum raised against either M6_Spy0159 (Anti-159) or M6_Spy0160 (Anti-160) cross-hybridizes with high molecular weight structures (pili) in these extracts. FIG. 111 provides a similar Western blot that verifies the presence of M6_Spy0159 and M6_Spy0160 in high molecular weight structures in GAS M6 strain 3650 extracts enriched for surface proteins.

SpyM6_0157 (a fibronectin-binding protein) may also be expressed on the surface of GAS serotype M6 bacteria. FIG. 174 shows the results of FACS analysis for surface expression of spyM6_0157 on M6 strain 3650. A slight shift in fluorescence is observed, which demonstrates that some spyM6_0157 may be expressed on the GAS cell surface.

(2) Adhesin Island Sequence within M1: GAS Adhesin Island 2 (“GAS AI-2”)

A GAS Adhesin Island within M1 serotype (SF370) is outlined in Table 5 below. This GAS adhesin island 2 (“GAS AI-2”) comprises surface proteins, a SrtB sortase, a SrtC1 sortase and a RofA divergently transcribed transcriptional regulator.

GAS AI-2 surface proteins include GAS 15 (Cpa), Spy0128 (thought to be a fimbrial protein) and Spy0130 (a hypothetical protein). Preferably, each of these GAS AI-2 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VVXTG (SEQ ID NO: 135), or EVXTG (SEQ ID NO: 136).

GAS AI-2 includes a srtB type sortase and a srtC1 sortase. As discussed above, GAS SrtB sortases may preferably anchor surface proteins with an LPSTG (SEQ ID NO: 166) motif, particularly where the motif is followed by a serine. GAS SrtC1 sortase may preferentially anchor surface proteins with a V(P/V)PTG (SEQ ID NO: 167) motif. GAS SrtC1 may be differentially regulated by RofA.

GAS AI-2 may also include a LepA putative signal peptidase I protein.

TABLE 5 GAS AI-2 sequence from M1 isolate (SF370)

AI-2 sequence identifier | Sortase substrate sequence or sortase type | Functional description
SPy0124 | | rofA regulatory protein
GAS15 (not annotated in SF370) | VVXTG | cpa
SPy0127 | | LepA putative signal peptidase I
SPy0128 (GAS16) | EVXTG | hypothetical protein (fimbrial)
SPy0129 (GAS17) | srtC1 | sortase
SPy0130 (GAS18) | LPXTG | hypothetical protein
SPy0131 | | conserved hypothetical protein
SPy0133 | | conserved hypothetical protein
SPy0135 (GAS20) | srtB | sortase (putative fimbrial-associated protein)

GAS 15, GAS 16, and GAS 18 appear to be present on the surface of GAS as part of oligomeric (pilus) structures. FIGS. 113-115 present electron micrographs of GAS serotype M1 strain SF370 immunogold stained for GAS 15 using anti-GAS 15 antiserum. FIGS. 116-121 provide electron micrographs of GAS serotype M1 strain SF370 immunogold stained for GAS 16 using anti-GAS 16 antiserum. FIGS. 122-125 present electron micrographs of GAS serotype M1 strain SF370 immunogold stained for GAS 18 using anti-GAS 18 antiserum. Oligomers of these proteins can be seen on the surface of SF370 bacteria in the immunogold stained micrographs.

FIG. 126 reveals a hyperoligomer on the surface of a GAS serotype M1 strain SF370 bacterium immunogold stained for GAS 18. This long hyperoligomeric structure comprising GAS 18 stretches far out into the supernatant from the surface of the bacteria.

FACS analysis has confirmed that the GAS AI-2 surface proteins GAS 15, GAS 16, and GAS 18 are expressed on the surface of GAS. FIG. 75 provides the results of FACS analysis for surface expression of GAS 15 on each of GAS serotypes M1 2719, M1 2580, M1 3280, M1 SF370, M1 2913, and M1 3348. A shift in fluorescence is observed for each GAS serotype when anti-GAS 15 antiserum is present, demonstrating cell surface expression. Table 20, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-GAS 15 antiserum, and the difference in fluorescence value between the pre-immune and anti-GAS 15 antiserum.

TABLE 20 Summary of FACS values for surface expression of GAS 15

Strain | Pre-immune | Anti-GAS 15 | Change
2719 | 159.46 | 712.71 | 553
2580 | 123.9 | 682.84 | 559
3280 | 217.02 | 639.69 | 423
SF370 | 201.93 | 722.68 | 521
2913 | 121.41 | 600.45 | 479
3348 | 152.09 | 446.41 | 294

FIGS. 76 and 79 provide the results of FACS analysis for surface expression of GAS 16 on each of GAS serotypes M1 2719, M1 2580, M1 3280, M1 SF370, M1 2913, and M1 3348. The FACS data in FIG. 76 were obtained using antiserum raised against full length GAS 16. In the presence of this anti-GAS 16 antiserum, a shift in fluorescence is observed for each GAS serotype, demonstrating its cell surface expression. Table 21, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-GAS 16 antiserum, and the change in fluorescence value between the pre-immune and anti-GAS 16 antiserum.

TABLE 21 Summary of FACS values for surface expression of GAS 16

Strain | Pre-immune | Anti-GAS 16 | Change
2719 | 233.27 | 690.09 | 457
2580 | 133.82 | 732.29 | 598
3280 | 264.47 | 649.43 | 385
SF370 | 237.2 | 727.46 | 490
2913 | 138.52 | 588.04 | 450
3348 | 180.56 | 420.93 | 240

The FACS data in FIG. 79 was obtained using antisera was raised againsta truncated GAS 16, which is encoded by SEQ ID NO: 179, shown below. SEQID NO: 179: GCTACAACAGTTCACGGGGAGACTGTTGTAAACGGAGCCAAACTAACAGTTACAAAAAACCTTGATTTAGTTAATAGCAATGCATTAATTCCAAATACAGATTTTACATTTAAAATCGAACCTGATACTACTGTCAACGAAGACGGAAATAAGTTTAAAGGTGTAGCTTTGAACACACCGATGACTAAAGTCACTTACACCAATTCAGATAAAGGTGGATCAAATACGAAAACTGCAGAATTTGATTTTTCAGAAGTTACTTTTGAAAAACCAGGTGTTTATTATTACAAAGTAACTGAGGAGAAGATAGATAAAGTTCCTGGTGTTTCTTATGATACAACATCTTACACTGTTCAAGTTCATGTCTTGTGGAATGAAGAGCAACAAAAACCAGTAGCTACTTATATTGTTGGTTATAAAGAAGGTAGTAAGGTGCCAATTCAGTTCAAAAATAGCTTAGATTCTACTACATTAACGGTGAAGAAAAAAGTTTCAGGTACCGGTGGAGATCGCTCTAAAGATTTTAATTTTGGTCTGACTTTAAAAGCAAATCAGTATTATAAGGCGTCAGAAAAAGTCATGATTGAGAAGACAACTAAAGGTGGTCAAGCTCCTGTTCAAACAGAGGCTAGTATAGATCAACTCTATCATTTTACCTTGAAAGATGGTGAATCAATCAAAGTCACAAATCTTCCAGTAGGTGTGGATTATGTTGTCACTGAAGACGATTACAAATCAGAAAAATATACAACCAACGTGGAAGTTAGTCCTCAAGATGGAGCTGTAAAAAATATCGCAGGTAATTCAACTGAACAAGAGACATCTACTGATAAAGATATGACCATTACTT TTACAAATAAAAAAGATTT

In the presence of this anti-GAS 16 antiserum, a shift in fluorescence is observed for each GAS serotype, demonstrating its cell surface expression. Table 22, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-GAS 16 antiserum, and the change in fluorescence value between the pre-immune and anti-GAS 16 antiserum.

TABLE 22 Summary of FACS values for surface expression of GAS 16 using a second antiserum

Strain | Pre-immune | Anti-GAS 16 | Change
2719 | 141.55 | 650.22 | 509
2580 | 119.57 | 672.35 | 553
3280 | 209.18 | 666.71 | 458
SF370 | 159.92 | 719.32 | 559
2913 | 115.97 | 585.9 | 470
3348 | 146.1 | 414.01 | 268

FIGS. 77 and 78 provide the results of FACS analysis for surface expression of GAS 18 on each of GAS serotypes M1 2719, M1 2580, M1 3280, M1 SF370, M1 2913, and M1 3348. The antiserum used to obtain the FACS data in each of FIGS. 77 and 78 was different, although each was raised against full length GAS 18. In the presence of each of the anti-GAS 18 antisera, a shift in fluorescence is observed for each GAS serotype, demonstrating its cell surface expression. Tables 23 and 24, below, quantitatively summarize the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, first or second anti-GAS 18 antiserum, and the change in fluorescence value between the pre-immune and first or second anti-GAS 18 antiserum.

TABLE 23 Summary of FACS values for surface expression of GAS 18

Strain | Pre-immune | Anti-GAS 18 | Change
2719 | 135.68 | 327.98 | 192
2580 | 116.32 | 379.41 | 263
3280 | 208.12 | 380.84 | 173
SF370 | 185.39 | 438.23 | 253
2913 | 119.95 | 373.32 | 253
3348 | 147.12 | 266.51 | 119

TABLE 24 Summary of FACS values for surface expression of GAS 18 using a second antiserum

Strain | Pre-immune | Anti-GAS 18 | Change
2719 | 150.4 | 250.39 | 100
2580 | 139.18 | 386.38 | 247
3280 | 253.38 | 347.72 | 94
SF370 | 188.64 | 373.11 | 184
2913 | 124.94 | 384.82 | 260
3348 | 168.8 | 213.65 | 45

Surface expression of GAS 15, GAS 16, and GAS 18 on M1 serotype GAS has also been confirmed by Western blot analysis. FIG. 89 shows that while pre-immune sera does not detect GAS M1 expression of GAS 15, anti-GAS 15 immune sera is able to detect GAS 15 protein in both total GAS M1 extracts and GAS M1 proteins enriched for cell surface proteins. The GAS 15 proteins detected in the M1 extracts enriched for surface proteins are also present as high molecular weight structures, indicating that GAS 15 may be in an oligomeric (pilus) form. FIG. 90 also shows the results of Western blot analysis of M1 serotype GAS using anti-GAS 15 antisera. Again, the lanes that contain GAS M1 extracts enriched for surface proteins (M1 prot sup) show the presence of high molecular weight structures that may be oligomers of GAS 15. FIG. 91 provides an additional Western blot identical to that of FIG. 90, but that was probed with pre-immune sera. As expected, no proteins were detected on this membrane.

FIG. 92 provides a Western blot that was probed for GAS 16 protein. While pre-immune sera does not detect GAS M1 expression of GAS 16, anti-GAS 16 immune sera is able to detect GAS 16 protein in GAS M1 extracts enriched for cell surface proteins. The GAS 16 proteins detected in the M1 extracts enriched for surface proteins are present as high molecular weight structures, indicating that GAS 16 may be in an oligomeric (pilus) form. FIG. 93 also shows the results of Western blot analysis of M1 serotype GAS using anti-GAS 16 antisera. The lanes that contain total GAS M1 protein (M1 tot new and M1 tot old) and the lane that contains GAS M1 extracts enriched for surface proteins (M1 prot sup) show the presence of high molecular weight structures that may be oligomers of GAS 16. FIG. 94 provides an additional Western blot identical to that of FIG. 93, but that was probed with pre-immune sera. As expected, no proteins were detected on this membrane.

FIG. 95 provides a Western blot that was probed for GAS 18 protein. While pre-immune sera does not detect GAS M1 expression of GAS 18, anti-GAS 18 immune sera is able to detect GAS 18 protein in GAS M1 extracts enriched for cell surface proteins. The GAS 18 proteins detected in the M1 extracts enriched for surface proteins are present as high molecular weight structures, indicating that GAS 18 may be in an oligomeric (pilus) form. FIG. 96 also shows the results of Western blot analysis of M1 serotype GAS using anti-GAS 18 antisera. The lane that contains GAS M1 extracts enriched for surface proteins (M1 prot sup) shows the presence of high molecular weight structures that may be oligomers of GAS 18. FIG. 97 provides an additional Western blot identical to that of FIG. 96, but that was probed with pre-immune sera. As expected, no proteins were detected on this membrane.

FIGS. 102-106 provide additional Western blots to verify the presence of GAS 15, GAS 16, and GAS 18 in high molecular weight structures in GAS. Each Western blot was performed using proteins from a different GAS M1 strain, 2580, 2913, 3280, 3348, and 2719. Each Western blot was probed with antisera raised against each of GAS 15, GAS 16, and GAS 18. As can be seen in FIGS. 102-106, none of the Western blots shows detection of proteins using pre-immune serum (Pα-158, Pα-15, Pα-16, or Pα-18), while each Western blot shows cross-hybridization of the GAS 15 (Iα-15), GAS 16 (Iα-16), and GAS 18 (Iα-18) antisera to high molecular weight structures. Thus, these Western blots confirm that GAS 15, GAS 16, and GAS 18 can be present in pili in GAS M1.

FIG. 107 provides a similar Western blot performed to detect GAS 15, GAS 16, and GAS 18 proteins in a GAS serotype M1 strain SF370 protein fraction enriched for surface proteins. This Western blot also shows detection of GAS 15 (Anti-15), GAS 16 (Anti-16), and GAS 18 (Anti-18) as high molecular weight structures.

(3) Adhesin Island Sequence within M3, M5 and M18: GAS Adhesin Island 3 (“GAS AI-3”)

GAS Adhesin Island sequences within M3, M5, and M18 serotypes are outlined in Tables 6-8 and 10 below. This GAS adhesin island 3 (“GAS AI-3”) comprises surface proteins, a SrtC2 sortase, and a divergently transcribed negative transcriptional regulator (Nra).

GAS AI-3 surface proteins include a collagen binding protein, a fimbrial protein, and an F2 like fibronectin-binding protein. GAS AI-3 surface proteins may also include a hypothetical surface protein. Preferably, each of these GAS AI-3 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VPXTG (SEQ ID NO: 137), QVXTG (SEQ ID NO: 138) or LPXAG (SEQ ID NO: 139).

GAS AI-3 includes a SrtC2 type sortase. GAS SrtC2 type sortases may preferably anchor surface proteins with a QVPTG (SEQ ID NO: 140) motif, particularly when the motif is followed by a hydrophobic region and a charged C terminus tail. GAS SrtC2 may be differentially regulated by Nra.

GAS AI-3 may also include a LepA putative signal peptidase I protein.

GAS AI-3 may also include a putative multiple sugar metabolism regulator.

TABLE 6 GAS AI-3 sequences from M3 isolate (MGAS315)

AI-3 sequence identifier | Sortase substrate sequence or sortase type | Functional description
SpyM3_0097 | | Negative transcriptional regulator (Nra)
SpyM3_0098 | VPXTG | putative collagen binding protein (Cpb)
SpyM3_0099 | | LepA putative signal peptidase I
SpyM3_0100 | QVXTG | conserved hypothetical protein (fimbrial)
SpyM3_0101 | SrtC2 | sortase
SpyM3_0102 | LPXAG | hypothetical protein
SpyM3_0103 | | putative multiple sugar metabolism regulator
SpyM3_0104 | LPXTG | protein F2 like fibronectin-binding protein

TABLE 7 GAS AI-3 sequence from M3 isolate (SSI-1)

AI-3 sequence identifier | Sortase substrate sequence or sortase type | Functional description
SPs0099 | | Negative transcriptional regulator (Nra)
SPs0100 | VPXTG | putative collagen binding protein (Cpb)
SPs0101 | | LepA putative signal peptidase I
SPs0102 | QVXTG | conserved hypothetical protein (fimbrial)
SPs0103 | SrtC2 | sortase
SPs0104 | LPXAG | hypothetical protein
SPs0105 | | putative multiple sugar metabolism regulator
SPs0106 | LPXTG | protein F2 like fibronectin-binding protein

TABLE 10 GAS AI-3 sequences from M5 isolate (Manfredo)

AI-3 sequence identifier | Sortase substrate sequence or sortase type | Functional description
orf77 | | Negative transcriptional regulator (Nra)
orf78 | VPXTG | putative collagen binding protein (Cpb)
orf79 | | LepA putative signal peptidase I
orf80 | QVXTG | conserved hypothetical protein (fimbrial)
orf81 | SrtC2 | sortase
orf82 | LPXAG | hypothetical protein
orf83 | | putative multiple sugar metabolism regulator
orf84 | LPXTG | protein F2 like fibronectin-binding protein

TABLE 8 GAS AI-3 sequences from M18 isolate (MGAS8232)

AI-3 sequence identifier | Sortase substrate sequence or sortase type | Functional description
spyM18_0125 | | Negative transcriptional regulator (Nra) (N-terminal fragment)
spyM18_0126 | VPXTG | putative collagen binding protein (Cpb)
spyM18_0127 | | LepA putative signal peptidase I
spyM18_0128 | QVXTG | conserved hypothetical protein (fimbrial)
spyM18_0129 | SrtC2 | sortase
spyM18_0130 | LPXAG | hypothetical protein
spyM18_0131 | | putative multiple sugar metabolism regulator
spyM18_0132 | LPXTG | protein F2 like fibronectin-binding protein

TABLE 44 GAS AI-3 sequences from M49 isolate (591)

AI-3 sequence identifier | Sortase substrate sequence or sortase type | Functional description
SpyoM01000156 | | Negative transcriptional regulator (Nra)
SpyoM01000155 | VPXTG | collagen binding protein (Cpa)
SpyoM01000154 | | LepA putative signal peptidase I
SpyoM01000153 | QVXTG | conserved hypothetical protein (fimbrial)
SpyoM01000152 | SrtC2 | sortase
SpyoM01000151 | LPXAG | hypothetical protein
SpyoM01000150 | | MsmRL
SpyoM01000149 | LPXTG | protein F2 like fibronectin-binding protein

A schematic of AI-3 serotypes M3, M5, M18, and M49 is shown in FIG. 51A. Each contains an open reading frame encoding a SrtC2-type sortase of nearly identical amino acid sequence. See FIG. 52B for an amino acid sequence alignment for each of the SrtC2 amino acid sequences.

The protein F2-like fibronectin-binding protein of each of these type 3 adhesin islands contains a pilin motif and an E-box. FIG. 60 indicates the amino acid sequence of the pilin motif and E-box of each of GAS AI-3 serotype M3 MGAS315 (SpyM3_0104/21909640), GAS AI-3 serotype M3 SSI (Sps0106/28895018), GAS AI-3 serotype M18 (SpyM18_0132/19745307), and GAS AI-3 serotype M5 (orf84).

FACS analysis has confirmed that the GAS AI-3 surface proteins SpyM3_0098, SpyM3_0100, SpyM3_0102, and SpyM3_0104 are expressed on the surface of GAS. FIG. 80 provides the results of FACS analysis for surface expression of SpyM3_0098 on each of GAS serotypes M3 2721 and M3 3135. A shift in fluorescence is observed for each GAS serotype when anti-SpyM3_0098 antiserum is present, demonstrating cell surface expression. Table 25, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0098 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0098 antiserum.

TABLE 25 Summary of FACS values for surface expression of SpyM3_0098

Strain | Pre-immune | Anti-spyM3_0098 | Change
2721 | 117.85 | 249.51 | 132
3135 | 99.17 | 277.21 | 178

FIG. 81 provides the results of FACS analysis for surface expression of SpyM3_0100 on each of GAS serotypes M3 2721 and M3 3135. A shift in fluorescence is observed for each GAS serotype when anti-SpyM3_0100 antiserum is present, demonstrating cell surface expression. Table 26, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0100 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0100 antiserum.

TABLE 26 Summary of FACS values for surface expression of SpyM3_0100

Strain | Pre-immune | Anti-spyM3_0100 | Change
2721 | 110.31 | 181.91 | 72
3135 | 97.87 | 250.01 | 152

FIG. 82 provides the results of FACS analysis for surface expression of SpyM3_0102 on each of GAS serotypes M3 2721 and M3 3135. A shift in fluorescence is observed for each GAS serotype when anti-SpyM3_0102 antiserum is present, demonstrating cell surface expression. Table 27, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0102 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0102 antiserum.

TABLE 27 Summary of FACS values for surface expression of SpyM3_0102 in M3 serotypes

Strain | Pre-immune | Anti-spyM3_0102 | Change
2721 | 109.86 | 155.26 | 45
3135 | 100.02 | 112.58 | 13

FIG. 82 also provides the results of FACS analysis for surface expression of a pilin antigen that has homology to SpyM3_0102 identified in a different GAS serotype, M6. FACS analysis conducted with the SpyM3_0102 antisera was able to detect surface expression of the homologous SpyM3_0102 antigen on each of GAS serotypes M6 2724, M6 3650, and M6 2894. Table 28, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0102 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0102 antiserum.

TABLE 28 Summary of FACS values for surface expression of SpyM3_0102 in M6 serotypes

Strain | Pre-immune | Anti-spyM3_0102 | Change
2724 | 146.59 | 254.03 | 107
3650 | 162.56 | 294.03 | 131
2894 | 175.49 | 313.69 | 138

SpyM3_0102 is also homologous to pilin antigen 19224139 of GAS serotype M12. Antisera raised against SpyM3_0102 is able to detect high molecular weight structures in GAS serotype M12 strain 2728 protein fractions enriched for surface proteins, which would contain the 19224139 antigen. See FIG. 109 at the lane labelled M12 2728 surf prot.

FIG. 83 provides the results of FACS analysis for surface expression of SpyM3_0104 on each of GAS serotypes M3 2721 and M3 3135. A shift in fluorescence is observed for each GAS serotype when anti-SpyM3_0104 antiserum is present, demonstrating cell surface expression. Table 29, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0104 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0104 antiserum.

TABLE 29 Summary of FACS values for surface expression of SpyM3_0104 in M3 serotypes

Strain | Pre-immune | Anti-spyM3_0104 | Change
2721 | 128.45 | 351.65 | 223
3135 | 105.1 | 339.88 | 235

FIG. 83 also provides the results of FACS analysis for surface expression of a pilin antigen that has homology to SpyM3_0104 identified in a different GAS serotype, M12. FACS analysis conducted with the SpyM3_0104 antisera was able to detect surface expression of the homologous SpyM3_0104 antigen on GAS serotype M12 2728. Table 30, below, quantitatively summarizes the FACS fluorescence values obtained for this GAS serotype in the presence of pre-immune antiserum, anti-SpyM3_0104 antiserum, and the difference in fluorescence value between the pre-immune and anti-SpyM3_0104 antiserum.

TABLE 30 Summary of FACS values for surface expression of SpyM3_0104 in an M12 serotype

Strain | Pre-immune | Anti-spyM3_0104 | Change
2728 | 198.57 | 288.75 | 90

FIG. 84 provides the results of FACS analysis for surface expression of SPs_0106 on each of GAS serotypes M3 2721 and M3 3135. A shift in fluorescence is observed for each GAS serotype when anti-SPs_0106 antiserum is present, demonstrating cell surface expression. Table 31, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-SPs_0106 antiserum, and the difference in fluorescence value between the pre-immune and anti-SPs_0106 antiserum.

TABLE 31 Summary of FACS values for surface expression of SPs_0106 in M3 serotypes

Strain | Pre-immune | Anti-SPs_0106 | Change
2721 | 116 | 463.28 | 347
3135 | 103.02 | 494.27 | 391

FIG. 84 also provides the results of FACS analysis for surface expression of a pilin antigen that has homology to SPs_0106 identified in a different GAS serotype, M12. FACS analysis conducted with the SPs_0106 antisera was able to detect surface expression of the homologous SPs_0106 antigen on GAS serotype M12 2728. Table 32, below, quantitatively summarizes the FACS fluorescence values obtained for this GAS serotype in the presence of pre-immune antiserum, anti-SPs_0106 antiserum, and the difference in fluorescence value between the pre-immune and anti-SPs_0106 antiserum.

TABLE 32 Summary of FACS values for surface expression of SPs_0106 in an M12 serotype

Strain | Pre-immune | Anti-SPs_0106 | Change
2728 | 304.01 | 254.64 | −49

(4) Adhesin Island Sequence within M12: GAS Adhesin Island 4 (“GAS AI-4”)

GAS Adhesin Island sequences within M12 serotype are outlined in Table 11 below. This GAS adhesin island 4 (“GAS AI-4”) comprises surface proteins, a SrtC2 sortase, and a RofA regulatory protein.

GAS AI-4 surface proteins may include a fimbrial protein, an F or F2 like fibronectin-binding protein, and a capsular polysaccharide adhesion protein (Cpa). GAS AI-4 surface proteins may also include a hypothetical surface protein in an open reading frame (orf). Preferably, each of the GAS AI-4 surface proteins includes an LPXTG sortase substrate motif, such as LPXTG (SEQ ID NO: 122), VPXTG (SEQ ID NO: 137), QVXTG (SEQ ID NO: 138) or LPXAG (SEQ ID NO: 139).

GAS AI-4 includes a SrtC2 type sortase. GAS SrtC2 type sortases may preferably anchor surface proteins with a QVPTG (SEQ ID NO: 140) motif, particularly when the motif is followed by a hydrophobic region and a charged C terminus tail.

GAS AI-4 may also include a LepA putative signal peptidase I protein and a MsmRL protein.

TABLE 11 GAS AI-4 sequences from M12 isolate (A735)

AI-4 sequence identifier | Sortase substrate sequence or sortase type | Functional description
19224133 | | RofA regulatory protein
19224134 | LPXTG | protein F
 | SrtB | SrtB (stop codon*)
19224135 | VPXTG | Cpa
19224136 | | LepA
19224137 | QVXTG | EftLSL.A (fimbrial)
19224138 | SrtC2 | EftLSL.B
19224139 | LPXAG | Orf2
19224140 | | MsmRL
19224141 | LPXTG | protein F2

A schematic of AI-4 serotype M12 is shown in FIG. 51A.

One of the open reading frames encodes a SrtC2-type sortase having an amino acid sequence nearly identical to the amino acid sequence of the SrtC2-type sortase of the AI-3 serotypes described above. See FIG. 52B for an amino acid sequence alignment for each of the SrtC2 amino acid sequences.

Other proteins encoded by the open reading frames of the AI-4 serotype M12 are homologous to proteins encoded by other known adhesin islands in S. pyogenes, as well as the GAS AI-3 serotype M5 (Manfredo). FIG. 52 is an amino acid alignment of the capsular polysaccharide adhesion protein (cpa) of AI-4 serotype M12 (19224135), GAS AI-3 serotype M5 (ORF78), S. pyogenes strain MGAS315 serotype M3 (21909634), S. pyogenes SSI-1 serotype M3 (28810257), S. pyogenes MGAS8232 serotype M3 (19745301), and GAS AI-2 serotype M1 (GAS15). The amino acid sequence of the AI-4 serotype M12 cpa shares a high degree of homology with other cpa proteins.

FIG. 53 shows that the F-like fibronectin-binding protein encoded by the AI-4 serotype M12 open reading frame (19224134) shares homology with an F-like fibronectin-binding protein found in S. pyogenes strain MGAS10394 serotype M6 (50913503).

FIG. 54 is an amino acid sequence alignment that illustrates that the F2-like fibronectin-binding protein of AI-4 serotype M12 (19224141) shares homology with the F2-like fibronectin-binding protein of S. pyogenes strain MGAS8232 serotype M3 (19745307), GAS AI-3 serotype M5 (ORF84), S. pyogenes strain SSI serotype M3 (28810263), and S. pyogenes strain MGAS315 serotype M3 (21909640).

FIG. 55 is an amino acid sequence alignment that illustrates that the fimbrial protein of AI-4 serotype M12 (19224137) shares homology with the fimbrial protein of GAS AI-3 serotype M5 (ORF80), and the hypothetical protein of S. pyogenes strain MGAS315 serotype M3 (21909636), S. pyogenes strain SSI serotype M3 (28810259), S. pyogenes strain MGAS8232 serotype M3 (19745303), and S. pyogenes strain M1 GAS serotype M1 (13621428).

FIG. 56 is an amino acid sequence alignment that illustrates that the hypothetical protein of GAS AI-4 serotype M12 (19224139) shares homology with the hypothetical protein of S. pyogenes strain MGAS315 serotype M3 (21909638), S. pyogenes strain SSI-1 serotype M3 (28810261), GAS AI-3 serotype M5 (ORF82), and S. pyogenes strain MGAS8232 serotype M3 (19745305).

The F2-like fibronectin-binding protein of the type 4 adhesin island also contains a highly conserved pilin motif and an E-box. FIG. 60 indicates the amino acid sequence of the pilin motif and E-box in AI-4 serotype M12.

FACS analysis has confirmed that the GAS AI-4 surface proteins 19224134, 19224135, 19224137, and 19224141 are expressed on the surface of GAS. FIG. 85 provides the results of FACS analysis for surface expression of 19224134 on GAS serotype M12 2728. A shift in fluorescence is observed when anti-19224134 antiserum is present, demonstrating cell surface expression. Table 33, below, quantitatively summarizes the FACS fluorescence values obtained for GAS serotype M12 2728 in the presence of pre-immune antiserum, anti-19224134 antiserum, and the difference in fluorescence value between the pre-immune and anti-19224134 antiserum.

TABLE 33 Summary of FACS values for surface expression of 19224134 in an M12 serotype

Strain | Pre-immune | Anti-19224134 | Change
M12 2728 | 137.8 | 485.32 | 348

FIG. 85 also provides the results of FACS analysis for surface expression of a pilin antigen that has homology to 19224134 identified in a different GAS serotype, M6. FACS analysis conducted with the 19224134 antisera was able to detect surface expression of the homologous 19224134 antigen on each of GAS serotypes M6 2724, M6 3650, and M6 2894. Table 34, below, quantitatively summarizes the FACS fluorescence values obtained for each GAS serotype in the presence of pre-immune antiserum, anti-19224134 antiserum, and the difference in fluorescence value between the pre-immune and anti-19224134 antiserum.

TABLE 34 Summary of FACS values for surface expression of 19224134 in M6 serotypes

Strain | Pre-immune | Anti-19224134 | Change
M6 2724 | 123.58 | 264.59 | 141
M6 3650 | 140.82 | 262.64 | 122
M6 2894 | 135.4 | 307.25 | 172

FIG. 86 provides the results of FACS analysis for surface expression of 19224135 on GAS serotype M12 2728. A shift in fluorescence is observed when anti-19224135 antiserum is present, demonstrating cell surface expression. Table 35, below, quantitatively summarizes the FACS fluorescence values obtained for GAS serotype M12 2728 in the presence of pre-immune antiserum, anti-19224135 antiserum, and the difference in fluorescence value between the pre-immune and anti-19224135 antiserum.

TABLE 35 Summary of FACS values for surface expression of 19224135 in an M12 serotype

Strain | Pre-immune | Anti-19224135 | Change
M12 2728 | 151.38 | 471.95 | 321

FIG. 87 provides the results of FACS analysis for surface expression of 19224137 on GAS serotype M12 2728. A shift in fluorescence is observed when anti-19224137 antiserum is present, demonstrating cell surface expression. Table 36, below, quantitatively summarizes the FACS fluorescence values obtained for GAS serotype M12 2728 in the presence of pre-immune antiserum, anti-19224137 antiserum, and the difference in fluorescence value between the pre-immune and anti-19224137 antiserum.

TABLE 36 Summary of FACS values for surface expression of 19224137 in an M12 serotype

Strain | Pre-immune | Anti-19224137 | Change
M12 2728 | 140.44 | 433.25 | 293

FIG. 88 provides the results of FACS analysis for surface expression of 19224141 on GAS serotype M12 2728. A shift in fluorescence is observed when anti-19224141 antiserum is present, demonstrating cell surface expression. Table 37, below, quantitatively summarizes the FACS fluorescence values obtained for GAS serotype M12 2728 in the presence of pre-immune antiserum, anti-19224141 antiserum, and the difference in fluorescence value between the pre-immune and anti-19224141 antiserum.

TABLE 37 Summary of FACS values for surface expression of 19224141 in an M12 serotype

Strain | Pre-immune | Anti-19224141 | Change
M12 2728 | 147.02 | 498 | 351

19224139 (designated as orf2) may also be expressed on the surface of GAS serotype M12 bacteria. FIG. 175 shows the results of FACS analysis for surface expression of 19224139 on M12 strain 2728. A slight shift in fluorescence is observed, which demonstrates that some 19224139 may be expressed on the GAS cell surface.

Surface expression of 19224135 on M12 serotype GAS has also been confirmed by Western blot analysis. FIG. 99 shows that while pre-immune serum (P α-4135) does not detect GAS M12 expression of 19224135, anti-19224135 immune serum (I α-4135) is able to detect 19224135 protein in both total GAS M12 extracts (M12 tot) and GAS M12 fractions enriched for cell surface proteins (M12 surf prot). The 19224135 proteins detected in the total GAS M12 extracts or the GAS M12 extracts enriched for surface proteins are also present as high molecular weight structures, indicating that 19224135 may be in an oligomeric (pilus) form. See also FIG. 108, which provides a further Western blot showing that anti-19224135 antiserum (Anti-19224135) immunoreacts with high molecular weight structures in GAS M12 strain 2728 protein extracts enriched for surface proteins.

Surface expression of 19224137 on M12 serotype GAS has also been confirmed by Western blot analysis. FIG. 100 shows that while pre-immune serum (P α-4137) does not detect GAS M12 expression of 19224137, anti-19224137 immune serum (I α-4137) is able to detect 19224137 protein in both total GAS M12 extracts (M12 tot) and GAS M12 fractions enriched for cell surface proteins (M12 surf prot). The 19224137 proteins detected in the total GAS M12 extracts or the GAS M12 extracts enriched for surface proteins are also present as high molecular weight structures, indicating that 19224137 may be in an oligomeric (pilus) form. See also FIG. 108, which provides a further Western blot showing that anti-19224137 antiserum (Anti-19224137) immunoreacts with high molecular weight structures in GAS M12 strain 2728 protein extracts enriched for surface proteins.

Streptococcus pneumoniae

Adhesin island sequences can be identified in Streptococcus pneumoniae genomes. Several of these genomes include the publicly available Streptococcus pneumoniae TIGR4 genome or Streptococcus pneumoniae strain 670 genome. Examples of these S. pneumoniae AI sequences are discussed below.

S. pneumoniae Adhesin Islands generally include a series of open reading frames within a S. pneumoniae genome that encode for a collection of surface proteins and sortases. A S. pneumoniae Adhesin Island may encode for amino acid sequences comprising at least one surface protein. Alternatively, a S. pneumoniae Adhesin Island may encode for at least two surface proteins and at least one sortase. Preferably, a S. pneumoniae Adhesin Island encodes for at least three surface proteins and at least two sortases. One or more of the surface proteins may include an LPXTG motif (such as LPXTG (SEQ ID NO: 122)) or other sortase substrate motif. One or more S. pneumoniae AI surface proteins may participate in the formation of a pilus structure on the surface of the S. pneumoniae bacteria.

S. pneumoniae Adhesin Islands of the invention preferably include a divergently transcribed transcriptional regulator. The transcriptional regulator may regulate the expression of the S. pneumoniae AI operon.

The S. pneumoniae AI surface proteins may bind or otherwise adhere to fibrinogen, fibronectin, or collagen.

A schematic of the organization of a S. pneumoniae AI locus is provided in FIG. 137. The locus comprises open reading frames encoding a transcriptional regulator (rlrA), cell wall surface proteins (rrgA, rrgB, rrgC), and sortases (srtB, srtC, srtD). FIG. 137 also indicates the S. pneumoniae strain TIGR4 gene name corresponding to each of these open reading frames.

Tables 9 and 38 identify the genomic location of each of these open reading frames in S. pneumoniae strains TIGR4 and 670, respectively.

TABLE 9 S. pneumoniae AI sequences from TIGR4

Genomic Location | Strand | Length | PID | Synonym (AI Sequence Identifier) | Functional description
436302..437831 | − | 509 | 15900377 | SP0461 | transcriptional regulator
438326..441007 | + | 893 | 15900378 | SP0462 | cell wall surface anchor family protein
441231..443228 | + | 665 | 15900379 | SP0463 | cell wall surface anchor family protein
443275..444456 | + | 393 | 15900380 | SP0464 | cell wall surface anchor family protein
444675..444806 | − | 43 | 15900381 | SP0465 | hypothetical protein
444857..445696 | + | 279 | 15900382 | SP0466 | sortase
445791..446576 | + | 261 | 15900383 | SP0467 | sortase
446563..447414 | + | 283 | 15900384 | SP0468 | sortase

TABLE 38 S. pneumoniae strain 670 AI sequences

Genomic Location | Strand | AI Sequence Identifier | Functional description
4383-5645 | − | Orf1_670 | IS1167, transposase
5910-7439 | − | Orf2_670 | transcriptional regulator, putative
7934-10606 | + | Orf3_670 | cell wall surface anchor family protein
10839-12773 | + | Orf4_670 | cell wall surface anchor family protein
12796-14001 | + | Orf5_670 | cell wall surface anchor family protein
14327-15241 | + | Orf6_670 | sortase, putative
15336-16121 | + | Orf7_670 | sortase, putative
16108-16959 | + | Orf8_670 | sortase, putative

The full-length nucleotide sequence of the S. pneumoniae strain 670 AI is also shown in FIG. 101, as is its translated amino acid sequence.

At least eight other S. pneumoniae strains contain an adhesin island locus corresponding to the locus depicted in FIG. 137. These strains were identified by an amplification analysis. The genomes of different S. pneumoniae strains were amplified with eleven separate sets of primers. The sequence of each of these primers is provided below in Table 41.

TABLE 41 Sequences of primers used to amplify AI locus

Primer Pair | Forward Primer Sequence | Reverse Primer Sequence
1 | ACTTTCTAATGAGTTGTTTAGGCG | AGCGACAAGCCACTGTATCATATT
2 | CTGGTCGATAACTCCTTCAATCTT | GTACGACAAAAGTGTGGCTTGTT
3 | GAATGCGATATTCAGGACCAACTA | ATCTCACTGAGTTAATCCGTTCAC
4 | TGTATACAAGTGTGTCATTGCCAG | CATCTTCACCTGTTCTCACATTTT
5 | GCGGTCTTTAGTCTTCAAAAACA | CAAGAGAAAAACACAGAGCCATAA
6 | TTGCTTAAGTAAGAGAGAAAGGAGC | CAGGAGTATAGTGTCCGCTTTCTT
7 | GGCAATGTTGACTTTATGAAGGTG | TATCAGCATCCCTTTATCTTCAAAC
8 | TGAGATTTTCTCGTTTCTCTTAGC | AATAGACGATGGGTATTGATCATGT
9 | CCGACGAACTTTGATGATTTATTG | ACCAACAGACGATGACTGTTAATC
10 | AATGACTTTGAGCCTGTCTTGAT | TTCTACAATTTCCTGGCCATTATC
11 | GCCATTTGGATCAGCTAAAAGTT | TTTTTCAACCCACTACAGTTGACA

These primers hybridized along the entire length of the AI locus to generate amplification products representative of sequences throughout the locus. See FIG. 138, which is a schematic of the location where each of these primers hybridizes to the S. pneumoniae AI locus. FIG. 139A provides the set of amplicons obtained from amplification of the AI locus in S. pneumoniae strain TIGR4. FIG. 139B provides the length, in base pairs, of each amplicon in S. pneumoniae strain TIGR4. Amplification of the genome of S. pneumoniae strains 19A Hungary 6, 6B Finland 12, 6B Spain 2, 9V Spain 3, 14 CSR 10, 19F Taiwan 14, 23F Taiwan 15, and 23F Poland 16 produced a set of eleven amplicons for the eleven primer pairs, indicating that each of these strains also contained the S. pneumoniae AI locus.
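The inference drawn above is that a strain carries the AI locus when every one of the eleven primer pairs yields an amplicon. A minimal Python sketch of that decision rule follows. The function name and the example inputs are hypothetical and do not represent measured data.

```python
N_PRIMER_PAIRS = 11  # number of primer pairs listed in Table 41

def has_complete_ai_locus(amplified_pairs) -> bool:
    """Return True if every primer pair (numbered 1 to 11) produced an amplicon."""
    expected = set(range(1, N_PRIMER_PAIRS + 1))
    return expected.issubset(amplified_pairs)

# Hypothetical inputs for illustration only (not experimental results)
all_pairs = list(range(1, N_PRIMER_PAIRS + 1))
partial_pairs = [1, 2, 3, 10, 11]

print(has_complete_ai_locus(all_pairs))      # every pair amplified
print(has_complete_ai_locus(partial_pairs))  # some pairs failed
```

A partial set of amplicons would call for follow-up, since it could reflect sequence divergence at a primer site rather than absence of the locus.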

The S. pneumoniae strains were also identified as containing the AI locus by comparative genome hybridization (CGH) analysis. The genomes of sixteen S. pneumoniae strains were interrogated for the presence of the AI locus by comparison to unique open reading frames of strain TIGR4. The AI locus was detected by this method in strains 19A Hungary 6 (19AHUN), 6B Finland 12 (6BFIN12), 6B Spain 2 (6BSP2), 14 CSR 10 (14CSR10), 9V Spain 3 (9VSP3), 19F Taiwan 14 (19FTW14), 23F Taiwan 15 (19FTW15), and 23F Poland 16 (23FP16). See FIG. 140.

The AI locus has been sequenced for each of these strains and the nucleotide and encoded amino acid sequence for each orf has been determined. An alignment of the complete nucleotide sequence of the adhesin island present in each of the ten strains is provided in FIG. 196. Aligning the amino acid sequences encoded by the orfs reveals conservation of many of the AI polypeptide amino acid sequences. For example, Table 39 provides a comparison of the percent identities of the polypeptides encoded within the S. pneumoniae strain 670 and TIGR4 adhesin islands.

TABLE 39 Percent identity comparison of S. pneumoniae strains AI sequences

S. pneumoniae strain 670 polypeptide | S. pneumoniae TIGR4 polypeptide | Shared identity of polypeptides
Orf1_670 | SP0460 | 99.3% identity in 422 aa overlap
Orf2_670 | SP0461 | 100.0% identity in 509 aa overlap
Orf3_670 | SP0462 | 83.2% identity in 895 aa overlap
Orf4_670 | SP0463 | 47.9% identity in 678 aa overlap
Orf5_670 | SP0464 | 99.7% identity in 393 aa overlap
Orf6_670 | SP0466 | 100.0% identity in 279 aa overlap
Orf7_670 | SP0467 | 94.2% identity in 260 aa overlap
Orf8_670 | SP0468 | 91.5% identity in 283 aa overlap

FIGS. 141-147 each provide a multiple sequence alignment for the polypeptides encoded by one of the open reading frames in all ten AI-positive S. pneumoniae strains. In each of the sequence alignments, light shading indicates an LPXTG motif and dark shading indicates the presence of an E-box motif with the conserved glutamic acid residue of the E-box motif in bold.
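Table 39 reports values of the form "x% identity in N aa overlap". As a rough illustration of how such a figure can be derived from a pairwise alignment, the Python sketch below counts identical columns over the aligned overlap. The function name, the use of "-" as the gap character, and the choice to count every column containing at least one residue as part of the overlap are assumptions; the program used to generate Table 39 may count differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of overlap columns in which two pre-aligned sequences are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column of two gaps carries no information
        overlap += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / overlap if overlap else 0.0

# Hypothetical toy alignment: four identical columns out of six
print(round(percent_identity("ACDE-F", "ACDQGF"), 1))
```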

The sequence alignments also revealed that the polypeptides encoded by most of the open reading frames may be divided into two groups of homology, S. pneumoniae AI-a and AI-b. S. pneumoniae strains that comprise AI-a include 14 CSR 10, 19A Hungary 6, 23F Poland 16, 670, 6B Finland 12, and 6B Spain 2. S. pneumoniae strains that comprise AI-b include 19F Taiwan 14, 9V Spain 3, 23F Taiwan 15, and TIGR4. An immunogenic composition of the invention may comprise one or more polypeptides from within each of S. pneumoniae AI-a and AI-b. For example, polypeptide RrgB, encoded by open reading frame 4, may be divided into two such groups of homology. One group contains the RrgB sequences of six S. pneumoniae strains and a second group contains the RrgB sequences of four S. pneumoniae strains. While the amino acid sequence of the strains within each individual group is 99-100 percent identical, the amino acid sequence identity of the strains in the first relative to the second group is only 48%. Table 42 provides the identity comparisons of the amino acid sequences encoded by each open reading frame for the ten S. pneumoniae strains.

TABLE 42 Conservation of amino acid sequences encoded by the S. pneumoniae AI locus

Putative Role of Polypeptide | Encoded by Orf | Groups of Homology | % Identity in Group | % Identity Between Groups
RlrA, transcriptional regulator | 2 | 1 group (10 strains) | 100 | n/a
RrgA, cell wall surface protein | 3 | 2 groups (6 + 4) | 98-100 | 83
RrgB, cell wall surface protein | 4 | 2 groups (6 + 4) | 99-100 | 48
RrgC, cell wall surface protein | 5 | 2 groups (6 + 4) | 99-100 | 97
SrtB, putative sortase | 6 | 2 groups (7 + 3) | 99-100 | 97
SrtC, putative sortase | 7 | 2 groups (6 + 4) | 95-100 | 93
SrtD, putative sortase | 8 | 2 groups (6 + 4) | 99-100 | 92

The division of homology between the RrgB polypeptide in the S. pneumoniae strains is due to a lack of amino acid sequence identity in the central amino acid residues. Amino acid residues 1-30 and 617-665 are identical for each of the ten S. pneumoniae strains. However, amino acid residues 31-616 share between 42 and 100 percent identity between strains. See FIG. 149. The shared N- and C-terminal regions of identity in the RrgB polypeptides may be preferred portions of the RrgB polypeptide for use in an immunogenic composition. Similarly, shared regions of identity in any of the polypeptides encoded by the S. pneumoniae AI locus may be preferable for use in immunogenic compositions. One of skill in the art, using the amino acid alignments provided in FIGS. 141-147, would readily be able to determine these regions of identity.
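Determining shared regions of identity from a multiple sequence alignment amounts to finding runs of columns in which every strain carries the same residue. The Python sketch below illustrates one way to do this. The function name, the `min_len` parameter, and the toy alignment are assumptions for illustration and are not taken from FIGS. 141-147.

```python
def conserved_regions(aligned_seqs, min_len: int = 1, gap: str = "-"):
    """Return 1-based (start, end) runs of columns identical in all aligned sequences."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have equal length")
    regions = []
    start = None  # 0-based start of the current identical run, if any
    for col in range(length):
        column = {seq[col] for seq in aligned_seqs}
        identical = len(column) == 1 and gap not in column
        if identical and start is None:
            start = col
        elif not identical and start is not None:
            if col - start >= min_len:
                regions.append((start + 1, col))
            start = None
    if start is not None and length - start >= min_len:
        regions.append((start + 1, length))
    return regions

# Hypothetical toy alignment: identical ends, variable middle column
toy_alignment = ["MKLAAQT", "MKLGAQT", "MKLSAQT"]
print(conserved_regions(toy_alignment))
```

Raising `min_len` filters out short runs of identity that arise by chance inside an otherwise variable region.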

The S. pneumoniae comprising these AI loci do, in fact, express high molecular weight polymers on their surface, indicating the presence of pili. See FIG. 182, which shows detection of high molecular weight structures expressed by S. pneumoniae strains that comprise the adhesin island locus depicted in FIG. 137; these strains are indicated as rlrA+. Confirming these findings, electron microscopy and negative staining detect the presence of pili extending from the surface of S. pneumoniae. See FIG. 185. To demonstrate that the adhesin island locus was responsible for the pili, the rrgA-srtD region of TIGR4 was deleted. Deletion of this region of the adhesin island resulted in a loss of pili expression. See FIG. 186. See also FIG. 235, which provides an electron micrograph of S. pneumoniae lacking the rrgA-srtD region immunogold stained using anti-RrgB and anti-RrgC antibodies. No pili can be seen. Similarly to that described above, a S. pneumoniae bacterium that lacks a transcriptional repressor, mgrA, of genes in the adhesin island expresses pili. See FIG. 187. However, and as expected, a S. pneumoniae bacterium that lacks both the mgrA and adhesin island genes in the rrgA-srtD region does not express pili. See FIG. 188.

These high molecular weight pili structures appear to play a role in adherence of S. pneumoniae to cells. S. pneumoniae TIGR4 that lack the pilus operon have significantly diminished ability to adhere to A549 alveolar cells in vitro. See FIG. 184.

The Sp0463 (S. pneumoniae TIGR4 rrgB) adhesin island polypeptide is expressed in oligomeric form. Whole cell extracts were analyzed by Western blot using a Sp0463 antiserum. The antiserum reacted with high molecular weight Sp0463 polymers. See FIG. 156. The antiserum did not cross-react with polypeptides from D39 or R6 strains of S. pneumoniae, which do not contain the AI locus depicted in FIG. 137. Immunogold labelling of S. pneumoniae TIGR4 using RrgB antiserum confirms the presence of RrgB in pili. FIG. 189 shows double-labeling of S. pneumoniae TIGR4 bacteria with immunolabeling for RrgB (5 nm gold particles) and RrgC (10 nm gold particles) protein. The RrgB protein is detected as present at intervals along the pilus structure. The RrgC protein is detected at the tips of the pili. See FIG. 234 at arrows; FIG. 234 is a close up of a pilus in FIG. 189 at the location indicated by *.

The RrgA protein appears to be present in and necessary for formation of high molecular weight structures on the surface of S. pneumoniae TIGR4. See FIG. 181, which provides the results of Western blot analysis of TIGR4 S. pneumoniae lacking the gene encoding RrgA. No high molecular weight structures are detected in S. pneumoniae that do not express RrgA using antiserum raised against RrgB. See also FIG. 183.

A detailed diagram of the amino acid sequence comparisons of the RrgA protein in the ten S. pneumoniae strains is shown in FIG. 148. The diagram reveals the division of the individual S. pneumoniae strains into the two different homology groups.

The cell surface polypeptides encoded by the S. pneumoniae TIGR4 AI, Sp0462 (rrgA), Sp0463 (rrgB), and Sp0464 (rrgC), have been cloned and expressed. See examples 15-17. A polyacrylamide gel showing successful recombinant expression of RrgA is provided in FIG. 190A. Detection of the RrgA protein, which is expressed in pET21b with a histidine tag, is also shown by Western blot analysis in FIG. 190B, using an anti-histidine tag antibody.

Antibodies that detect RrgB and RrgC have been produced in mice. See FIGS. 191 and 192, which show detection of RrgB and RrgC, respectively, using the raised antibodies.

In addition to the identification of these S. pneumoniae adhesin islands, coding sequences for SrtB type sortases have been identified in several S. pneumoniae clinical isolates, demonstrating conservation of a SrtB type sortase across these isolates.

Recombinantly Produced AI polypeptides

It is also an aspect of the invention to alter a non-AI polypeptide to be expressed as an AI polypeptide. The non-AI polypeptide may be genetically manipulated to additionally contain AI polypeptide sequences, e.g., a sortase substrate, pilin, or E-box motif, which may cause expression of the non-AI polypeptide as an AI polypeptide. Alternatively, the non-AI polypeptide may be genetically manipulated to replace an amino acid sequence within the non-AI polypeptide with AI polypeptide sequences, e.g., a sortase substrate, pilin, or E-box motif, which may cause expression of the non-AI polypeptide as an AI polypeptide. Any number of amino acid residues may be added to the non-AI polypeptide or may be replaced within the non-AI polypeptide to cause its expression as an AI polypeptide. At least 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 50, 75, 100, 150, 200, or 250 amino acid residues may be replaced or added to the non-AI polypeptide amino acid sequence. GBS 322 may be one such non-AI polypeptide that may be expressed as an AI polypeptide.

GBS Adhesin Island Sequences

The GBS AI polypeptides of the invention can, of course, be prepared by various means (e.g. recombinant expression, purification from GBS, chemical synthesis etc.) and in various forms (e.g. native, fusions, glycosylated, non-glycosylated etc.). They are preferably prepared in substantially pure form (i.e. substantially free from other streptococcal or host cell proteins) or substantially isolated form.

The GBS AI proteins of the invention may include polypeptide sequences having sequence identity to the identified GBS proteins. The degree of sequence identity may vary depending on the amino acid sequence in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more). Polypeptides having sequence identity include homologs, orthologs, allelic variants and functional mutants of the identified GBS proteins. Typically, 50% identity or more between two proteins is considered to be an indication of functional equivalence. Identity between proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension penalty=1.
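The paragraph above names a Smith-Waterman search with a gap open penalty of 12 and a gap extension penalty of 1. As a rough illustration of how such a local alignment score is computed, the Python sketch below implements the standard three-matrix recurrences for gap penalties of this kind. It is not the MPSRCH implementation: the simple match and mismatch scores stand in for a substitution matrix and are assumptions, as is the convention that the first residue of a gap costs `gap_open` and each further residue costs `gap_extend`.

```python
def smith_waterman_affine(a: str, b: str, match: int = 5, mismatch: int = -4,
                          gap_open: int = 12, gap_extend: int = 1):
    """Best local alignment score of a and b with open/extend gap penalties."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score ending at (i, j); E: ending in a gap in a; F: ending in a gap in b
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # extend an existing gap or open a new one
            E[i][j] = max(E[i][j - 1] - gap_extend, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_extend, H[i - 1][j] - gap_open)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # local alignment: a cell score never drops below zero
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best

# Hypothetical toy sequences: the second lacks the two residues "HI"
print(smith_waterman_affine("ACDEFGHIKLMN", "ACDEFGKLMN"))
```

In the toy case, bridging the two-residue deletion costs 12 + 1 = 13, which is less than the score gained by joining the two matching blocks, so the best local alignment spans the gap. A percent identity would then be taken from the traceback of the best-scoring alignment, which this sketch omits for brevity.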

The GBS adhesin island polynucleotide sequences may include polynucleotide sequences having sequence identity to the identified GBS adhesin island polynucleotide sequences. The degree of sequence identity may vary depending on the polynucleotide sequence in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more).

The GBS adhesin island polynucleotide sequences of the invention may include polynucleotide fragments of the identified adhesin island sequences. The length of the fragment may vary depending on the polynucleotide sequence of the specific adhesin island sequence, but the fragment is preferably at least 10 consecutive nucleotides (e.g. at least 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more).

The GBS adhesin island amino acid sequences of the invention may include polypeptide fragments of the identified GBS proteins. The length of the fragment may vary depending on the amino acid sequence of the specific GBS antigen, but the fragment is preferably at least 7 consecutive amino acids (e.g. 8, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). Preferably the fragment comprises one or more epitopes from the sequence. Other preferred fragments include (1) the N-terminal signal peptides of each identified GBS protein, (2) the identified GBS protein without their N-terminal signal peptides, and (3) each identified GBS protein wherein up to 10 amino acid residues (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) are deleted from the N-terminus and/or the C-terminus e.g. the N-terminal amino acid residue may be deleted. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
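The two kinds of fragment described above, runs of consecutive amino acids and proteins with residues deleted from one or both termini, can be enumerated mechanically. The Python sketch below illustrates both. The function names are hypothetical, and the example uses only the first ten residues of SEQ ID NO: 2 to keep the output short.

```python
def consecutive_fragments(seq: str, length: int):
    """All fragments of exactly `length` consecutive residues, in order."""
    if length < 1 or length > len(seq):
        return []
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]

def terminal_deletion(seq: str, n_del: int = 0, c_del: int = 0) -> str:
    """Sequence with n_del residues removed from the N-terminus and c_del from the C-terminus."""
    if n_del < 0 or c_del < 0 or n_del + c_del >= len(seq):
        raise ValueError("deletions must be non-negative and leave at least one residue")
    return seq[n_del:len(seq) - c_del]

# First ten residues of SEQ ID NO: 2, used here only as a short example
example = "MKLSKKLLFS"
print(consecutive_fragments(example, 7))
print(terminal_deletion(example, n_del=1))  # N-terminal residue deleted
```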

GBS 80

Examples of preferred GBS 80 fragments are discussed below.Polynucleotide and polypeptide sequences of GBS 80 from a variety of GBSserotypes and strain isolates are set forth in FIGS. 18 and 22. Thepolynucleotide and polypeptide sequences for GBS 80 from GBS serotype V,strain isolate 2603 are also included below as SEQ ID NOS 1 and 2: SEQID NO.1 ATGAAATTATCGAAGAAGTTATTGTTTTCGGCTGCTGTTTTAACAATGGTGGCGGGGTCAACTGTTGAACCAGTAGCTCAGTTTGCGACTGGAATGAGTATTGTAAGAGCTGCAGAAGTGTCACAAGAACGCCCAGCGAAAACAACAGTAAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTACTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAACTATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACGTTATAAAGTCAAGACGGATATTTCTGTTGATGAATTGAAAAAATTGACAACAGTTGAAGCAGCAGATGCAAAAGTTGGAACGATTCTTGAAGAAGGTGTCAGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAAAAGTAATGTGAGATACTTGTATGTAGAAGATTTAAAGAATTCACCTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTACCCTAAAAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAAAAATTAGGTCAGGACGATGCAGGTTATACGATTGGTGAAGAATTCAAATGGTTCTTGAAATCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAATTACTGATAAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCAAGATTGGTTCGAAAACACTGAATAGAGATGAGCACTACACTATTGATGAACCAACAGTTGATAACCAAAATACATTAAAAATTACGTTTAAACCAGAGAAATTTAAAGAAATTGCTGAGCTACTTAAAGGAATGACCCTTGTTAAAAATCAAGATGCTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTGGAAATTCCAGTTGCATCAACTATTAATGAAAAAGCAGTTTTAGGAAAAGCAATTGAAAATACTTTTGAACTTCAATATGACCATACTCCTGATAAAGCTGACAATCCAAAACCATCTAATCCTCCAAGAAAACCAGAAGTTCATACTGGTGGGAAACGATTTGTAAAGAAAGACTCAACAGAAACACAAACACTAGGTGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAGATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCTGTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTTTGAGATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAGTAACTTACAAATTAAAAGAAACAAAAGCACCAGAAGGTTATGTAATCCCTGATAAAGAAATCGAGTTTACAGTATCACAAACATCTTATAATACAAAACCAACTGACATCACGGTTGATAGTGCTGATGCAACACCTGATACAATTAAAAACAACAAACGTCCTTCAATCCCTAATACTGGTGGTATTGGTACGGCTATCTTTGTCGCTATCGGTGCTGCGGTGATGGCTTTTGCTGTTAAGGGGATGAAGCGTCGT 
ACAAAAGATAAC SEQ IDNO: 2 MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVMAFAVKGMKRR TKDN

As described above, the compositions of the invention may include fragments of AI proteins. In some instances, removal of one or more domains, such as a leader or signal sequence region, a transmembrane region, a cytoplasmic region or a cell wall anchoring motif, may facilitate cloning of the gene encoding the protein and/or recombinant expression of the GBS AI protein. In addition, fragments comprising immunogenic epitopes of the cited GBS AI proteins may be used in the compositions of the invention.

For example, GBS 80 contains an N-terminal leader or signal sequenceregion which is indicated by the underlined sequence at the beginning ofSEQ ID NO: 2 above. In one embodiment, one or more amino acids from theleader or signal sequence region of GBS 80 are removed. An example ofsuch a GBS 80 fragment is set forth below as SEQ ID NO: 3: SEQ ID NO: 3AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGA AVMAFAVKGMKRRTKDN

GBS 80 contains a C-terminal transmembrane region which is indicated bythe underlined sequence near the end of SEQ ID NO: 2 above. In oneembodiment, one or more amino acids from the transmembrane region and/ora cytoplasmic region are removed. An example of such a GBS 80 fragmentis set forth below as SEQ ID NO: 4: SEQ ID NO: 4MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTG

GBS 80 contains an amino acid motif indicative of a cell wall anchor:SEQ ID NO: 5 IPNTG (shown in italics in SEQ ID NO: 2 above). In somerecombinant host cell systems, it may be preferable to remove this motifto facilitate secretion of a recombinant GBS 80 protein from the hostcell. Accordingly, in one preferred fragment of GBS 80 for use in theinvention, the transmembrane and/or cytoplasmic regions and the cellwall anchor motif are removed from GBS 80. An example of such a GBS 80fragment is set forth below as SEQ ID NO: 6. SEQ ID NO: 6MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTTGEEFKWFLKSTTPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKTTFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAELEIPVASTLNEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTD ITVDSADATPDTIKNNKRPS

Alternatively, in some recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
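The two options described above (removing the cell wall anchor motif, or retaining it and removing only the downstream transmembrane and cytoplasmic regions) can be illustrated with a minimal sketch. The helper name `truncate_at_anchor` and its `keep_motif` parameter are hypothetical, and the input is only the C-terminal tail of GBS 80 as printed in SEQ ID NO: 2, not the full protein.

```python
def truncate_at_anchor(protein: str, motif: str, keep_motif: bool = False) -> str:
    """Truncate `protein` at the last occurrence of a cell wall anchor motif.

    keep_motif=False drops the motif and everything after it;
    keep_motif=True keeps the motif and drops only what follows it.
    """
    idx = protein.rfind(motif)
    if idx == -1:
        raise ValueError(f"motif {motif!r} not found in sequence")
    end = idx + len(motif) if keep_motif else idx
    return protein[:end]


# C-terminal tail of GBS 80 (excerpt of SEQ ID NO: 2, for illustration only).
GBS80_TAIL = "TIKNNKRPSIPNTGGIGTAIFVAIGAAVMAFAVKGMKRRTKDN"

without_anchor = truncate_at_anchor(GBS80_TAIL, "IPNTG")
with_anchor = truncate_at_anchor(GBS80_TAIL, "IPNTG", keep_motif=True)

print(without_anchor)  # ends like SEQ ID NO: 6 (motif removed)
print(with_anchor)     # ends like SEQ ID NO: 4 (motif retained)
```

Searching from the C-terminal end (`rfind`) is a design assumption: it avoids truncating at a chance match earlier in the protein, since the anchor motif lies near the C-terminus.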

In one embodiment, the leader or signal sequence region, thetransmembrane and cytoplasmic regions and the cell wall anchor motif areremoved from the GBS 80 sequence. An example of such a GBS 80 fragmentis set forth below as SEQ ID NO: 7. SEQ ID NO: 7AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPS

Applicants have identified a particularly immunogenic fragment of theGBS 80 protein. This immunogenic fragment is located towards theN-terminus of the protein and is underlined in the GBS 80 SEQ ID NO: 2sequence below. The underlined fragment is set forth below as SEQ ID NO:8. SEQ ID NO: 2 MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKTTFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVMAFAVKGMKRR TKDN SEQ ID NO: 8AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQ NTLKITFKPEKFKEIAELLKG

The immunogenicity of the protein encoded by SEQ ID NO: 7 was compared against PBS, GBS whole cell, GBS 80 (full length) and another fragment of GBS 80, located closer to the C-terminus of the peptide (SEQ ID NO: 9, below).

SEQ ID NO: 9
MTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPS

Both an Active Maternal Immunization Assay and a Passive Maternal Immunization Assay were conducted on this collection of proteins.

As used herein, an Active Maternal Immunization assay refers to an in vivo protection assay where female mice are immunized with the test antigen composition. The female mice are then bred and their pups are challenged with a lethal dose of GBS. Serum titers of the female mice during the immunization schedule are measured as well as the survival time of the pups after challenge.

Specifically, the Active Maternal Immunization assays referred to herein used groups of four CD-1 female mice (Charles River Laboratories, Calco, Italy). These mice were immunized intraperitoneally with the selected proteins in Freund's adjuvant at days 1, 21 and 35, prior to breeding. Mice 6-8 weeks old received 20 μg protein/dose when immunized with a single antigen, and 30-45 μg protein/dose (15 μg of each antigen) when immunized with a combination of antigens. The immune response of the dams was monitored using serum samples taken on days 0 and 49. The female mice were bred 2-7 days after the last immunization (at approximately t=36-37), and typically had a gestation period of 21 days. Within 48 hours of birth, the pups were challenged via I.P. with GBS in a dose approximately equal to an amount which would be sufficient to kill 70-90% of unimmunized pups (as determined by empirical data gathered from PBS control groups). The GBS challenge dose is preferably administered in 50 μl of THB medium. Preferably, the pup challenge takes place at 56 to 61 days after the first immunization. The challenge inocula were prepared starting from frozen cultures diluted to the appropriate concentration with THB prior to use. Survival of pups was monitored for 5 days after challenge.

As used herein, the Passive Maternal Immunization Assay refers to an in vivo protection assay where pregnant mice are passively immunized by injecting rabbit immune sera (or control sera) approximately 2 days before delivery. The pups are then challenged with a lethal dose of GBS.

Specifically, the Passive Maternal Immunization Assay referred to herein used groups of pregnant CD1 mice which were passively immunized by injecting 1 ml of rabbit immune sera or control sera via I.P., 2 days before delivery. Newborn mice (24-48 hrs after birth) were challenged via I.P. with a 70-90% lethal dose of GBS serotype II COH1. The challenge dose, obtained by diluting a frozen mid log phase culture, was administered in 50 μl of THB medium.

For both assays, the number of pups surviving GBS infection was assessed every 12 hrs for 4 days. Statistical significance was estimated by Fisher's exact test.

The results of each assay for immunization with SEQ ID NO: 7, SEQ ID NO: 8, PBS and GBS whole cell are set forth in Tables 1 and 2 below.

TABLE 1
Immunization

Antigen                      Alive/total   % Survival   Fisher's exact test
PBS (neg control)            13/80         16%
GBS (whole cell)             54/65         83%          P < 0.00000001
GBS80 (intact)               62/70         88%          P < 0.00000001
GBS80 (fragment) SEQ ID 7    35/64         55%          P = 0.0000013
GBS80 (fragment) SEQ ID 8    13/67         19%          P = 0.66

TABLE 2
Passive Maternal Immunization

Antigen                      Alive/total   % Survival   Fisher's exact test
PBS (neg control)            12/42         28%
GBS (whole cell)             48/52         92%          P < 0.00000001
GBS80 (intact)               48/55         87%          P < 0.00000001
GBS80 (fragment) SEQ ID 7    45/57         79%          P = 0.0000006
GBS80 (fragment) SEQ ID 8    13/54         24%          P = 1

As shown in Tables 1 and 2, immunization with the SEQ ID NO: 7 GBS 80 fragment provided a substantially higher survival rate for the challenged pups than immunization with the comparison SEQ ID NO: 8 GBS 80 fragment. These results indicate that the SEQ ID NO: 7 GBS 80 fragment may comprise an important immunogenic epitope of GBS 80.
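As a hedged illustration of how the significance of such survival counts can be estimated, the following sketch implements a two-sided Fisher's exact test in plain Python and applies it to the Table 1 alive/total counts for the two GBS 80 fragments versus the PBS control. The function name and the choice of a two-sided test are assumptions; the text does not state which variant of the test was used, so the computed values need not match the tabulated P values exactly.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. antigen vs. control); columns are alive and dead.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x survivors in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Alive/total counts taken from Table 1.
pbs_alive, pbs_total = 13, 80
frag7_alive, frag7_total = 35, 64   # GBS80 (fragment) SEQ ID 7
frag8_alive, frag8_total = 13, 67   # GBS80 (fragment) SEQ ID 8

p_frag7 = fisher_exact_two_sided(frag7_alive, frag7_total - frag7_alive,
                                 pbs_alive, pbs_total - pbs_alive)
p_frag8 = fisher_exact_two_sided(frag8_alive, frag8_total - frag8_alive,
                                 pbs_alive, pbs_total - pbs_alive)

print(f"SEQ ID 7 fragment vs PBS: P = {p_frag7:.2g}")
print(f"SEQ ID 8 fragment vs PBS: P = {p_frag8:.2g}")
```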

As discussed above, pilin motifs, containing conserved lysine (K)residues have been identified in GBS 80. The pilin motif sequences areunderlined in SEQ ID NO: 2, below. Conserved lysine (K) residues aremarked in bold, at amino acid residues 199 and 207 and at amino acidresidues 368 and 375. The pilin sequences, in particular the conservedlysine residues, are thought to be important for the formation ofoligomeric, pilus-like structures of GBS 80. Preferred fragments of GBS80 include at least one conserved lysine residue. Preferably, fragmentsinclude at least one pilin sequence. SEQ ID NO: 2MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKN VVTDEP KTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRK PEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVMAFAVKGMKRR TKDN

E boxes containing conserved glutamic residues have also been identifiedin GBS 80. The E box motifs are underlined in SEQ ID NO: 2 below. Theconserved glutamic acid (E) residues, at amino acid residues 392 and471, are marked in bold. The E box motifs, in particular the conservedglutamic acid residues, are thought to be important for the formation ofoligomeric pilus-like structures of GBS 80. Preferred fragments of GBS80 include at least one conserved glutamic acid residue. Preferably,fragments include at least one E box motif. SEQ ID NO: 2MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTTPANLGDYEKEEITDKFADGLTYKSVGKIKIGSKTLNRDLHYTIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVMAFAVKGMKRR TKDN

Similarly, the following offers examples of preferred GBS 104 fragments.Nucleotide and amino acid sequences of GBS 104 sequenced from serotype Visolated strain 2603 are set forth below as SEQ ID NOS 10 and 11: SEQ IDNO. 10 ATGAAAAAGAGACAAAAAATATGGAGAGGGTTATCAGTTACTTTACTAATCCTGTCCCAAATTCCATTTGGTATATTGGTACAAGGTGAAACCCAAGATACCAATCAAGCACTTGGAAAAGTAATTGTTAAAAAAACGGGAGACAATGCTACACCATTAGGCAAAGCGACTTTTGTGTTAAAAAATGACAATGATAAGTCAGAAACAAGTCACGAAACGGTAGAGGGTTCTGGAGAAGCAACCTTTGAAAACATAAAACCTGGAGACTACACATTAAGAGAAGAAACAGCACCAATTGGTTATAAAAAAACTGATAAAACCTGGAAAGTTAAAGTTGCAGATAACGGAGCAACAATAATCGAGGGTATGGATGCAGATAAAGCAGAGAAACGAAAAGAAGTTTTGAATGCCCAATATCCAAAATCAGCTATTTATGAGGATACAAAAGAAAATTACCCATTAGTTAATGTAGAGGGTTCCAAAGTTGGTGAACAATACAAAGCATTGAATCCAATAAATGGAAAAGATGGTCGAAGAGAGATTGCTGAAGGTTGGTTATCAAAAAAAATTACAGGGGTCAATGATCTCGATAAGAATAAATATAAAATTGAATTAACTGTTGAGGGTAAAACCACTGTTGAAACGAAAGAACTTAATCAACCACTAGATGTCGTTGTGCTATTAGATAATTCAAATAGTATGAATAATGAAAGAGCCAATATATTCTCAAAGAGCATTAAAAGCTGGGAAGCAGTTGAAAAGCTGATTGATAAAATTACATCAAATAAAGACAATAGAGTAGCTCTTGTGACATATGCCTCAACCATTTTTGATGGTACTGAAGCGACCGTATCAAAGGGAGTTGCCGATCAAAATGGTAAAGCGCTGAATGATAGTGTATCATGGGATTATCATAAAACTACTTTTACAGCAACTACACATAATTACAGTTATTTAAATTTAACAAATGATGCTAACGAAGTTAATATTCTAAAGTCAAGAATTCCAAAGGAAGCGGAGCATATAAATGGGGATCGCACGCTCTATCAATTTGGTGCGACATTTACTCAAAAAGCTCTAATGAAAGCAAATGAAATTTTAGAGACACAAAGTTCTAATGCTAGAAAAAAACTTATTTTTCACGTAACTGATGGTGTCCCTACGATGTCTTATGCCATAAATTTTAATCCTTATATATCAACATCTTACCAAAACCAGTTTAATTCTTTTTTAAATAAAATACCAGATAGAAGTGGTATTCTCCAAGAGGATTTTATAATCAATGGTGATGATTATCAAATAGTAAAAGGAGATGGAGAGAGTTTTAAACTGTTTTCGGATAGAAAAGTTCCTGTTACTGGAGGAACGACACAAGCAGCTTATCGAGTACCGCAAAATCAACTCTCTGTAATGAGTAATGAGGGATATGCAATTAATAGTGGATATATTTATCTCTATTGGAGAGATTACAACTGGGTCTATCCATTTGATCCTAAGACAAAGAAAGTTTCTGCAACGAAACAAATCAAAACTCATGGTGAGCCAACAACATTATACTTTAATGGAAATATAAGACCTAAAGGTTATGACATTTTTACTGTTGGGATTGGTGTAAACGGAGATCCTGGTGCAACTCCTCTTGAAGCTGAGAAATTTATGCAATCAATATCAAGTAAAACAGAAAATTATACTAATGTTGATGATACAAATAAAATTTATGATGAGCTAAATAAATACTTTAAAACAATTGTTGAGGAAAAACATTCTATTGTTGATG
GAAATGTGACTGATCCTATGGGAGAGATGATTGAATTCCAATTAAAAAATGGTCAAAGTTTTACACATGATGATTACGTTTTGGTTGGAAATGATGGCAGTCAATTAAAAAATGGTGTGGCTCTTGGTGGACCAAACAGTGATGGGGGAATTTTAAAAGATGTTACAGTGACTTATGATAAGACATCTCAAACCATCAAAATCAATCATTTGAACTTAGGAAGTGGACAAAAAGTAGTTCTTACCTATGATGTACGTTTAAAAGATAACTATATAAGTAACAAATTTTACAATACAAATAATCGTACAACGCTAAGTCCGAAGAGTGAAAAAGAACCAAATACTATTCGTGATTTCCCAATTCCCAAAATTCGTGATGTTCGTGAGTTTCCGGTACTAACCATCAGTAATCAGAAGAAAATGGGTGAGGTTGAATTTATTAAAGTTAATAAAGACAAACATTCAGAATCGCTTTTGGGAGCTAAGTTTCAACTTCAGATAGAAAAAGATTTTTCTGGGTATAAGCAATTTGTTCCAGAGGGAAGTGATGTTACAACAAAGAATGATGGTAAAATTTATTTTAAAGCACTTCAAGATGGTAACTATAAATTATATGAAATTTCAAGTCCAGATGGCTATATAGAGGTTAAAACGAAACCTGTTGTGACATTTACAATTCAAAATGGAGAAGTTACGAACCTGAAAGCAGATCCAAATGCTAATAAAAATCAAATCGGGTATCTTGAAGGAAATGGTAAACATCTTATTACCAACACTCCCAAACGCCCACCAGGTGTTTTTCCTAAAACAGGGGGAATTGGTACAATTGTCTATATATTAGTTGGTTCTACTTTTATGATACTTACCATTTGTT CTTTCCGTCGTAAACAATTGSEQ ID NO. 11 MKKRQKIWRGLSVTLLILSQIPFGILVQGETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTKENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKTTGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQRALKAGEAVEKLIDKTTSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMKANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYTSTSYQNQFNSFLNKIPDRSGILQEDFIINGDDYQIVKGDGESFKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYAINSGYIYLYWRDYNWVYPFDPKTKKVSATKQIKTHGEPTTLYFNGNIRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYFKTIVEEKHSIVDGNVTDPMGEMIEFQLKNGQSFTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTIKINHLNLGSGQKVVLTYDVRLKDNYISNKEYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDFSGYKQFVPEGSDVTTKNDGKIYFKALQDGNYKLYEISSPDGYIEVKTKPVVTFTIQNGEVTNLKADPNANKNQIGYLEGNGKHLITNTPKRPPGVFPKTGGIGTIVYILVGSTFMILTICSFRRKQL

GBS 104 contains an N-terminal leader or signal sequence region which isindicated by the underlined sequence at the beginning of SEQ ID NO 11above. In one embodiment, one or more amino acid sequences from theleader or signal sequence region of GBS 104 are removed. An example ofsuch a GBS 104 fragment is set forth below as SEQ ID NO 12. SEQ ID NO:12 GETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTKENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKITGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQPALKAGEAVEKLIDKITSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMIANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYISTSYQNQFNSFLNKIPDRSGILQEDFIINGDDYQIVKGDGESFKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYATNSGYIYLYWRDYNWVYPFDPKTKKVSATKQIKTHGEPTTLYPNGNTRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYEKTIVEEKHSIVDGNVTDPMGEMIEFQLKNGQSPTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTIKINHLNLGSGQKVVLTYDVRLKDNYISNKFYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDESGYKQFVPEGSDVTTKNDGKIYFKALQDGNYKLYEISSPDGYIEVKTKPVVTFTIQNGEVTNLKADPNANKNQIGYLEGNGKHLITNTPKRPPGVFPKTGGIGTIVYILVGSTFM ILTICSFRRKQL

GBS 104 contains a C-terminal transmembrane and/or cytoplasmic regionwhich is indicated by the underlined region near the end of SEQ ID NO 11above. In one embodiment, one or more amino acids from the transmembraneor cytomplasmic regions are removed. An example of such a GBS 104fragment is set forth below as SEQ ID NO 13. SEQ ID NO: 13MKKRQKIWRGLSVTLLILSQIPFGILVQGETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTKENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKITGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQRALKAGEAVEKLIDKITSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMKANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYISTSYQNQFNSFLNKIPDRSGTLQEDFIINGDDYQIVKGDGESFKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYAINSGYIYLYWRDYNWVYPFDPKTKKVSATKQIKTHGEPTTLYFNGNTRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYFKTIVEEKHSIVDGNVTDPMGEMIEFQLKNGQSFTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTIKINHLNLGSGQKVVLTYDVRLKDNYISNKFYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDFSGYKQFVPEGSDVTTKNDGKTYFKALQDGNYKLYEISSPDGYIEVKTKPVVTFTIQNGEVTNLKADPNANKNQIGYLEGNGKHLITN T

In one embodiment, one or more amino acids from the leader or signalsequence region and one or more amino acids from the transmembrane orcytoplasmic regions are removed. An example of such a GBS 104 fragmentis set forth below as SEQ ID NO 14. SEQ ID NO: 14GETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTKENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKITGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQRALKAGEAVEKLIDKITSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMKANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYISTSYQNQFNSFLNKIPDRSGILQEDFIINGDDYQIVKGDGESFKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYAINSGYIYLYWRDYNWVYPFDPKTKKVSATKQIKTHGEPTTLYFNGNIRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYFKTIVEEKHSIVDGNVTDPMGEMTEFQLKNGQSFTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTIKINHLNLGSGQKVVLTYDVRLKDNYISNKFYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDFSGYKQFVPEGSDVTTKNDGKIYEKALQDGNYKLYEISSPDGYIEVKTKPVVTFTTQNGEVTNLKADPNANKNQIGYLEGNGKHLITNT

GBS 104, like GBS 80, contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 123 FPKTG (shown in italics in SEQ ID NO: 11 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant GBS 104 protein from the host cell. Accordingly, in one preferred fragment of GBS 104 for use in the invention, only the transmembrane and/or cytoplasmic regions and the cell wall anchor motif are removed from GBS 104. Alternatively, in some recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, containing conserved lysine (K) residues, have beenidentified in GBS 104. The pilin motif sequences are underlined in SEQID NO: 11, below. Conserved lysine (K) residues are marked in bold, atamino acid residues 141 and 149 and at amino acid residues 499 and 507.The pilin sequence, in particular the conserved lysine residues, arethought to be important for the formation of oligomeric, pilus-likestructures of GBS 104. Preferred fragments of GBS 104 include at leastone conserved lysine residue. Preferably, fragments include at least onepilin sequence. SEQ ID NO. 11MKKRQKIWRGLSVTLLILSQIPFGILVQGETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTK ENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKITGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQRALKAGEAVEKLIDKITSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMKANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYISTSYQNQFNSFLNKIPDRSGILQEDFIINGDDYQIVKGDGESEKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYAINSGYIYLYWRDYNWVYPFDPKT KKVSATKQIKTHGEPTTLYFNGNIRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYFKTIVEEKHSIVDGNVTDPMGEMIEFQLKNGQSFTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTTKINHLNLGSGQKVVLTYDVRLKDNYISNKFYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDFSGYKQFVPEGSDVTTKNDGKTYFKALQDGNYKLYEISSPDGYIEVKTKPVVTFTIQNGEVTNLKADPNANKNQIGYLEGNGKHLITNTPKRPPGVFPKTGGIGTIVYILVGSTFMILLTICSFRRKQL

Two E boxes containing a conserved glutamic residues have also beenidentified in GBS 104. The E box motifs are underlined in SEQ ID NO: 11below. The conserved glutamic acid (E) residues, at amino acid residues94 and 798, are marked in bold. The E box motifs, in particular theconserved glutamic acid residues, are thought to be important for theformation of oligomeric pilus-like structures of GBS 104. Preferredfragments of GBS 104 include at least one conserved glutamic acidresidue. Preferably, fragments include at least one E box motif. SEQ IDNO. 11 MKKRQKIWRGLSVTLLILSQIPFGILVQGETQDTNQALGKVIVKKTGDNATPLGKATFVLKNDNDKSETSHETVEGSGEATFENIKPGDYTLREETAPIGYKKTDKTWKVKVADNGATIIEGMDADKAEKRKEVLNAQYPKSAIYEDTKENYPLVNVEGSKVGEQYKALNPINGKDGRREIAEGWLSKKITGVNDLDKNKYKIELTVEGKTTVETKELNQPLDVVVLLDNSNSMNNERANNSQRALKAGEAVEKLIDKITSNKDNRVALVTYASTIFDGTEATVSKGVADQNGKALNDSVSWDYHKTTFTATTHNYSYLNLTNDANEVNILKSRIPKEAEHINGDRTLYQFGATFTQKALMKANEILETQSSNARKKLIFHVTDGVPTMSYAINFNPYISTSYQNQFNSFLNKIPDRSGILQEDFIINGDDYQIVKGDGESEKLFSDRKVPVTGGTTQAAYRVPQNQLSVMSNEGYAINSGYIYLYWRDYNWVYPFDPKTKKVSATKQIKTHGEPTTLYFNGNIRPKGYDIFTVGIGVNGDPGATPLEAEKFMQSISSKTENYTNVDDTNKIYDELNKYFKTIVEEKHSIVDGNVTDPMGEMIEFQLKNGQSFTHDDYVLVGNDGSQLKNGVALGGPNSDGGILKDVTVTYDKTSQTTKINHLNLGSGQKVVLTYDVRLKDNYISNKFYNTNNRTTLSPKSEKEPNTIRDFPIPKIRDVREFPVLTISNQKKMGEVEFIKVNKDKHSESLLGAKFQLQIEKDFSGYKQFVPEGSDVTTKNDGKTYFKALQDGNYKLYEISSPDGYIEVKTKPVVTFTIQNGEVTNLKADPNANKNQIGYLEGNGKHLITNTPKRPPGVFPKTGGIGTIVYILVGSTFMILLTICSFRRKQLGBS 067

The following offers examples of preferred GBS 067 fragments. Nucleotideand amino acid sequence of GBS 067 sequences from serotype V isolatedstrain 2603 are set forth below as SEQ ID NOS: 15 and 16. SEQ ID NO: 15ATGAGAAAATACCAAAAATTTTCTAAAATATTGACGTTAAGTCTTTTTTGTTTGTCGCAAATACCGCTTAATACCAATGTTTTAGGGGAAAGTACCGTACCGGAAAATGGTGCTAAAGGAAAGTTAGTTGTTAAAAAGACAGATGACCAGAACAAACCACTTTCAAAAGCTACCTTTGTTTTAAAAACTACTGCTCATCCAGAAAGTAAAATAGAAAAAGTAACTGCTGAGCTAACAGGTGAAGCTACTTTTGATAATCTCATACCTGGAGATTATACTTTATCAGAAGAAACAGCGCCCGAAGGTTATAAAAAGACTAACCAGACTTGGCAAGTTAAGGTTGAGAGTAATGGAAAAACTACGATACAAAATAGTGGTGATAAAAATTCCACAATTGGACAAAATCAGGAAGAACTAGATAAGCAGTATCCCCCCACAGGAATTTATGAAGATACAAAGGAATCTTATAAACTTGAGCATGTTAAAGGTTCAGTTCCAAATGGAAAGTCAGAGGCAAAAGCAGTTAACCCATATTCAAGTGAAGGTGAGCATATAAGAGAAATTCCAGAGGGAACATTATCTAAACGTATTTCAGAAGTAGGTGATTTAGCTCATAATAAATATAAAATTGAGTTAACTGTCAGTGGAAAAACCATAGTAAAACCAGTGGACAAACAAAAGCCGTTAGATGTTGTCTTCGTACTCGATAATTCTAACTCAATGAATAACGATGGCCCAAATTTTCAAAGGCATAATAAAGCCAAGAAAGCTGCCGAAGCTCTTGGGACCGCAGTAAAAGATATTTTAGGAGCAAACAGTGATAATAGGGTTGCATTAGTTACCTATGGTTCAGATATTTTTGATGGTAGGAGTGTAGATGTCGTAAAAGGATTTAAAGAAGATGATAAATATTATGGCCTTCAAACTAAGTTCACAATTCAGACAGAGAATTATAGTCATAAACAATTAACAAATAATGCTGAAGAGATTATAAAAAGGATTCCGACAGAAGCTCCTAAAGCTAAGTGGGGATCTACTACCAATGGATTAACTCCAGAGCAACAAAAGGAGTACTATCTTAGTAAAGTAGGAGAAACATTTACTATGAAAGCCTTCATGGAGGCAGATGATATTTTGAGTCAAGTAAATCGAAATAGTCAAAAAATTATTGTTCATGTAACTGATGGTGTTCCTACGAGATCATATGCTATTAATAATTTTAAACTGGGTGCATCATATGAAAGCCAATTTGAACAAATGAAAAAAAATGGATATCTAAATAAAAGTAATTTTCTACTTACTGATAAGCCCGAGGATATAAAAGGAAATGGGGAGAGTTACTTTTTGTTTCCCTTAGATAGTTATCAAACACAGATAATCTCTGGAAACTTACAAAAACTTCATTATTTAGATTTAAATCTTAATTACCCTAAAGGTACAATTTATCGAAATGGACCAGTGAAAGAACATGGAACACCAACCAAACTTTATATAAATAGTTTAAAACAGAAAAATTATGACATTTTTAATTTTGGTATCGATATATCTGGTTTTAGACAAGTTTATAATGAGGAGTATAAGAAAAATCAAGATGGTACTTTTCAAAAATTGAAAGAGGAAGCTTTTAAACTTTCAGATGGAGAAATCACAGAACTAATGAGGTCGTTCTCTTCCAAACCTGAGTACTACACCCCTATCGTAACTTCAGCCGATACATCTAACAATGAAATTTTATCTAAAATTCAGCAACAATTTGAAACGATTTTAACAAAAGAAAACTCAATTGTT
AATGGAACTATCGAAGATCCTATGGGTGATAAAATCAATTTACAGCTTGGTAATGGACAAACATTACAGCCAAGTGATTATACTTTACAGGGAAATGATGGAAGTGTAATGAAGGATGGTATTGCAACTGGTGGGCCTAATAATGATGGTGGAATACTTAAGGGGGTTAAATTAGAATACATCGGAAATAAACTCTATGTTAGAGGTTTGAATTTAGGAGAAGGTCAAAAAGTAACACTCACATATGATGTGAAACTAGATGACAGTTTTATAAGTAACAAATTCTATGACACTAATGGTAGAACAACATTGAATCCTAAGTCAGAGGATCCTAATACACTTAGAGATTTTCCAATCCCTAAAATTCGTGATGTGAGAGAATATCCTACAATAACGATTAAAAACGAGAAGAAGTTAGGTGAAATTGAATTTATAAAAGTTGATAAAGATAATAATAAGTTGCTTCTCAAAGGAGCTACGTTTGAACTTCAAGAATTTAATGAAGATTATAAACTTTATTTACCAATAAAAAATAATAATTCAAAAGTAGTGACGGGAGAAAACGGCAAAATTTCTTACAAAGATTTGAAAGATGGCAAATATCAGTTAATAGAAGCAGTTTCGCCGGAGGATTATCAAAAAATTACTAATAAACCAATTTTAACTTTTGAAGTGGTTAAAGGATCGATAAAAAATATAATAGCTGTTAATAAACAGATTTCTGAATATCATGAGGAAGGTGACAAGCATTTAATTACCAACACGCATATTCCACCAAAAGGAATTATTCCTATGACAGGTGGGAAAGGAATTCTATCTTTCATTTTAATAGGTGGAGCTATGATGTCTATTGCAGGTGGAATTTATATTTGGAAAAGGTATAAGAAATCTAGTGATATGTCCATCAAAAAA GAT SEQ ID NO: 16MRKYQKFSKILTLSLFCLSQIPLNTNVLGESTVPENGAKGKLVVKKTDDQNKPLSKATFVLKTTAHPESKIEKVTAELTGEATFDNLIPGDYTLSEETAPEGYKKTNQTWQVKVESNGKTTIQNSGDKNSTIGQNQEELDKQYPPTGIYEDTKESYKLEHVKGSVPNGKSEAKAVNPYSSEGEHIREIPEGTLSKRISEVGDLAHNKYKIELTVSGKTIVKPVDKQKPLDVVFVLDNSNSMNNDGPNFQRHNKAKKAAEALGTAVKDILGANSDNRVALVTYGSDIFDGRSVDVVKGFKEDDKYYGLQTKFTIQTENYSHKQLTNNAEEIIKRIPTEAPKAKWGSTTNGLTPEQQKEYYLSKVGETFTMKAFMEADDILSQVNRNSQKIIVHVTDGVPTRSYAINNFKLGASYESQFEQMKKNGYLNKSNFLLTDKPEDIKGNGESYFLFPLDSYQTQIISGNLQKLHYLDLNLNYPKGTIYRNGPVKEHGTPTKLYINSLKQKNYDIFNFGIDISGFRQVYNEEYKKNQDGTFQKLKEEAFKLSDGEITELMRSFSSKPEYYTPIVTSADTSNNEILSKIQQQFETILTKENSIVNGTIEDPMGDKTNLQLGNGQTLQPSDYTLQGNDGSVMKDGIATGGPNNDGGILKGVKLEYIGNKLYVRGLNLGEGQKVTLTYDVKLDDSFTSNKFYDTNGRTTLNPKSEDPNTLRDFPIPKIRDVREYPTITIKNEKKLGEIEFIKVDKDNNKLLLKGATFELQEFNEDYRLYLPIKNNNSKVVTGENGKISYKDLKDGKYQLIEAVSPEDYQKITNKPILTFEVVKGSIKNIIAVNKQISEYHEEGDKHLITNTHIPPKGIIPMTGGKGILSFILIGGAMMSIAGGIYIWKRYKKSSDMSIKK D

GBS 067 contains a C-terminus transmembrane region which is indicated bythe underlined region closest to the C-terminus of SEQ ID NO: 16 above.In one embodiment, one or more amino acids from the transmembrane regionis removed and or the amino acid is truncated before the transmembraneregion. An example of such a GBS 067 fragment is set forth below as SEQID NO: 17. SEQ ID NO: 17MRKYQKFSKILTLSLFCLSQIPLNTNVLGESTVPENGAKGKLVVKKTDDQNKPLSKATFVLKTTAHPESKIEKVTAELTGEATPDNLIPGDYTLSEETAPEGYKKTNQTWQVKVESNGKTTIQNSGDKNSTIGQNQEELDKQYPPTGIYEDTKESYKLEHVKGSVPNGKSEAKAVNPYSSEGEHIRETPEGTLSKRISEVGDLAHNKYKTELTVSGKTIVKPVDKPKPLDVVFVLDNSNSMNNDGPNFQRHNKAKKAAEALGTAVKDILGANSDNRVALVTYGSDIFDGRSVDVVKGFKEDDKYYGLQTKETIQTENYSHKQLTNNAEEIIKRIPTEAPKAKWGSTTNGLTPEQQKEYYLSKVGETFTMKAFMEADDILSQVNRNSQKIIVHVTDGVPTRSYAINNFKLGASYESQFEQMKKNGYLNKSNFLLTDKPEDIKGNGESYFLFPLDSYQTQIISGNLQKLHYLDLNLNYPKGTIYRNGPVKEHGTPTKLYINSLKQKNYDIFNFGTDISGFRQVYNEEYKKNQDGTFQKLKEEAFKLSDGEITELMRSESSKPEYYTPIVTSADTSNNEILSKIQQQFETILTKENSIVNGTIEDPMGDKINLQLGNGQTLQPSDYTLQGNDGSVMKDGIATGGPNNDGGILKGVKLEYIGNKLYVRGLNLGEGQKVTLTYDVKLDDSFISNKFYDTNGRTTLNPKSEDPNTLRDFPIPKIRDVREYPTITIKNEKKLGEIEFIKVDKDNNKLLLKGATFELQEFNEDYKLYLPIKNNNSKVVTGENGKISYKDLKDGKYQLIEAVSPEDYQKITNKPILTFEVVKGSIKNIIAVNKQISEYHEEGDKHLITN THIPPKGIIPMTGGKGILS

GBS 067 contains an amino acid motif indicative of a cell wall anchor(an LPXTG (SEQ ID NO: 122) motif): SEQ ID NO: 18 IPMTG. (shown initalics in SEQ ID NO: 16 above). In some recombinant host cell systems,it may be preferable to remove this motif to facilitate secretion of arecombinant GBS 067 protein from the host cell. Accordingly, in onepreferred fragment of GBS 067 for use in the invention, thetransmembrane and the cell wall anchor motif are removed from GBS 67. Anexample of such a GBS 067 fragment is set forth below as SEQ ID NO: 19.SEQ ID NO: 19 MRKYQKFSKILTLSLFCLSQIPLNTNVLGESTVPENGAKGKLVVKKTDDQNKPLSKATFVLKTTAHPESKIEKVTAELTGEATFDNLIPGDYTLSEETAPEGYKKTNQTWQVKVESNGKTTIQNSGDKNSTIGQNQEELDKQYPPTGIYEDTKESYKLEHVKGSVPNGKSEAKAVNPYSSEGEHIREIPEGTLSKRTSEVGDLAHNKYKIELTVSGKTIVKPVDKQKPLDVVFVLDNSNSMNNDGPNFQRHNKAKKPAEALGTAVKDILGANSDNRVALVTYGSDIFDGRSVDVVKGFKEDDKYYGLQTKFTIQTENYSHKQLTNNAEEIIKRIPTEAPKAKWGSTTNGLTPEQQKEYYLSKVGETFTMKAFMEADDILSQVNRNSQKIIVHVTDGVPTRSYAINNFKLGASYESQFEQMKKNGYLNKSNFLLTDKPEDIKGNGESYFLFPLDSYQTQIISGNLQKLHYLDLNLNYPKGTIYRNGPVKEHGTPTKLYINSLKQKNYDIFNFGIDISGERQVYNEEYKKNQDGTFQKLKEEAFKLSDGEITELMRSFSSKPEYYTPIVTSADTSNNEILSKIQQQFETILTKENSIVNGTIEDPMGDKINLQLGNGQTLQPSDYTLQGNDGSVMKDGIATGGPNNDGGILKGVKLEYIGNKLYVRGLNLGEGQKVTLTYDVKLDDSFTSNKFYDTNGRTTLNPKSEDPNTLRDFPIPKIRDVREYPTITIKNEKKLGEIEFIKVDKDNNKLLLKGATFELQEFNEDYKLYLPIKNNNSKVVTGENGKISYKDLKDGKYQLIEAVSPEDYQKITNKPILTFEVVKGSIKNIIAVNKQISEYHEEGDKHLITN THIPPKGI

Alternatively, in some recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Three pilin motifs, containing conserved lysine (K) residues have beenidentified in GBS 67. The pilin motif sequences are underlined in SEQ IDNO: 16, below. Conserved lysine (K) residues are marked in bold, atamino acid residues 478 and 488, at amino acid residues 340 and 342, andat amino acid residues 703 and 717. The pilin sequences, in particularthe conserved lysine residues, are thought to be important for theformation of oligomeric, pilus-like structures of GBS 67. Preferredfragments of GBS 67 include at least one conserved lysine residue.Preferably, fragments include at least one pilin sequence. SEQ ID NO: 16MRKYQKFSKILTLSLFCLSQIPLNTNVLGESTVPENGAKGKLVVKKTDDQNKPLSKATFVLKTTAHPESKIEKVTAELTGEATFDNLIPGDYTLSEETAPEGYKKTNQTWQVKVESNGKTTIQNSGDKNSTIGQNQEELDKQYPPTGIYEDTKESYKLEHVKGSVPNGKSEAKAVNPYSSEGEHIREIPEGTLSKRISEVGDLAHNKYKIELTVSGKTIVKPVDKQKPLDVVFVLDNSNSMNNDGPNFQRHNKAKKAAEALGTAVKDILGANSDNRVALVTYGSDIFDGRSVDVVKGFKEDDKYYGLQTKFTIQTENYSHKQLTNNAEEIIKRIPTEAPKAK WGSTTNGLTPEQQKEYYLSKVGETFTMKAFMEADDILSQVNRNSQKIIVHVTDGVPTRSYAINNFKLGASYESQFEQMKKNGYLNKSNFLLTDKPEDIKGNGESYFLFPLDSYQTQIISGNLQKLHYLDLNLNYPKGTIYRNGPVK EHGTPTKLYINSLKQKNYDIFNFGIDISGFRQVYNEEYKKNQDGTFQKLKEEAFKLSDGEITELMRSFSSKPEYYTPIVTSADTSNNEILSKIQQQFETILTKENSIVNGTIEDPMGDKINLQLGNGQTLQPSDYTLQGNDGSVMKDGIATGGPNNDGGILKGVKLEYIGNKLYVRGLNLGEGQKVTLTYDVKLDDSFISNKFYDTNGRTTL NPKSEDPNTLRDFPIPKIRDVREYPTITIKNEKKLGEIEFIKVDKDNNKLLLKGATFELQEFNEDYKLYLPIKNNNSKVVTGENGKISYKDLKDGKYQLIEAVSPEDYQKITNKPILTFEVVKGSIKNIIAVNKQISEYHEEGDKHLITNTHIPPKGIIPMTGGKGILSFILIGGAMMSIAGGIYIWKRYKKSSDMSIKK D

Two E boxes containing conserved glutamic residues have also beenidentified in GBS 67. The E box motifs are underlined in SEQ ID NO: 16below. The conserved glutamic acid (E) residues, at amino acid residues96 and 801, are marked in bold. The E box motifs, in particular theconserved glutamic acid residues, are thought to be important for theformation of oligomeric pilus-like structures of GBS 67. Preferredfragments of GBS 67 include at least one conserved glutamic acidresidue. Preferably, fragments include at least one E box motif. SEQ IDNO: 16 MRKYQKFSKILTLSLFCLSQIPLNTNVLGESTVPENGAKGKLVVKKTDDQNKPLSKATFVLKTTAHPESKIEKVTAELTGEARFDNLIPGDYTLSEETAPEGYKKTNQTWQVKVESNGKTTIQNSGDKNSTIGQNQEELDKQYPPTGIYEDTKESYKLEHVKGSVPNGKSEAKAVNPYSSEGEHIREIPEGTLSKRISEVGDLAHNKYKIELTVSGKTIVKPVDKQKPLDVVFVLDNSNSMNNDGPNFQRHNKAKKAAEALGTAVKDILGANSDNRVALVTYGSDIFDGRSVDVVKGFKEDDKYYGLQTKFTIQTENYSHKQLTNNAEEIIKRIPTEAPKAKWGSTTNGLTPEQQKEYYLSKVGETFTMKAFMEADDILSQVNRNSQKIIVHVTDGVPTRSYAINNFKLGASYESQFEQMKKNGYLNKSNFLLTDKPEDIKGNGESYFLFPLDSYQTQIISGNLQKLHYLDLNLNYPKGTIYRNGPVKEHGTPTKLYINSLKQKNYDIFNFGIDISGFRQVYNEEYKKNQDGTFQKLKEEAFKLSDGEITELMRSFSSKPEYYTPIVTSADTSNNEILSKIQQQFETILTKENSIVNGTIEDPMGDKINLQLGNGQTLQPSDYTLQGNDGSVMKDGIATGGPNNDGGILKGVKLEYIGNKLYVRGLNLGEGQKVTLTYDVKLDDSFISNKFYDTNGRTTLNPKSEDPNTLRDFPIPKIRDVREYPTITIKNEKKLGEIEFIKVDKDNNKLLLKGATFELQEFNEDYKLYLPIKNNNSKVVTGENGKISYKDLKDGKYQLIEAVSPEDYQKITNKPILTFEVVKGSIKNIIAVNKQISEYHEEGDKHLITNTHIPPKGIIPMTGGKGILSFILIGGAMMSIAGGIYIWKRYKKSSDMSIKK D

Predicted secondary structure for the GBS 067 amino acid sequence is set forth in FIG. 33. As shown in this figure, GBS 067 contains several regions predicted to form alpha helical structures. Such alpha helical regions are likely to form coiled-coil structures and may be involved in oligomerization of GBS 067.

The amino acid sequence for GBS 067 also contains a region which is homologous to the Cna_B domain of the Staphylococcus aureus collagen-binding surface protein (pfam05738). Although the Cna_B region is not thought to mediate collagen binding, it is predicted to form a beta sandwich structure. In the Staphylococcus aureus protein, this beta sandwich structure is thought to form a stalk that presents the ligand binding domain away from the bacterial cell surface. This same amino acid sequence region is also predicted to be an outer membrane protein involved in cell envelope biogenesis.

The amino acid sequence for GBS 067 contains a region which is homologous to a von Willebrand factor (vWF) type A domain. The vWF type A domain is present at amino acid residues 229-402 of GBS 067 as shown in SEQ ID NO: 16. This type of sequence is typically found in extracellular proteins such as integrins and is thought to mediate adhesion, including adhesion to collagen, fibronectin, and fibrinogen, discussed above.
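Residue ranges such as 229-402 are read here as 1-based and inclusive, which is the usual convention for protein coordinates but is an assumption about this text. The sketch below shows how such a range, or a single conserved residue position, maps onto 0-based string indexing. The helper names are hypothetical, and the demonstration uses only the first ten residues of SEQ ID NO: 16 as printed, not the full sequence.

```python
def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end of `seq`, 1-based and inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range falls outside the sequence")
    return seq[start - 1:end]


def residue_at(seq: str, position: int) -> str:
    """Return the single residue at a 1-based position."""
    return region(seq, position, position)


def region_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


# First ten residues of SEQ ID NO: 16 (excerpt, for illustration only).
GBS067_START = "MRKYQKFSKI"

print(region(GBS067_START, 2, 4))
print(residue_at(GBS067_START, 6))
print(region_length(229, 402))  # size of the vWF type A domain range
```

The same lookup could be used to confirm that a stated conserved lysine or glutamic acid position actually carries a K or E in a given copy of the sequence.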

Because applicants have identified GBS 67 as a surface exposed protein on GBS and because GBS 67 may be involved in GBS adhesion, the immunogenicity of the GBS 67 protein was examined in mice. The results of an immunization assay with GBS 67 are set forth in Table 48, below.

TABLE 48
GBS 67 Protects Mice in an Immunization Assay

Challenge GBS        GBS 67 immunogen              PBS immunogen                 FACS
strain (serotype)    dead/treated    % survival    dead/treated    % survival    Δmean
3050 (II)            0/30            100           29/49           41            460
CJB111 (V)           76/185          59            143/189         24            481
7357 b (Ib)          34/56           39            65/74           12            316

As shown in Table 48, immunization with GBS 67 provides a substantially improved survival rate for challenged mice relative to negative control (PBS-immunized) mice. These results indicate that GBS 67 may comprise an immunogenic composition of the invention.
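The % survival figures in Table 48 follow from the dead/treated counts. A minimal sketch of that arithmetic is given below; the function name and the rounding to the nearest whole percent are assumptions chosen because they reproduce the tabulated values.

```python
def percent_survival(dead: int, treated: int) -> int:
    """Percent of treated animals that survived, rounded to a whole percent."""
    if treated <= 0 or not 0 <= dead <= treated:
        raise ValueError("need 0 <= dead <= treated and treated > 0")
    return round(100 * (treated - dead) / treated)


# (dead, treated) counts from Table 48, keyed by challenge strain.
TABLE_48 = {
    "3050 (II)":   {"GBS 67": (0, 30),   "PBS": (29, 49)},
    "CJB111 (V)":  {"GBS 67": (76, 185), "PBS": (143, 189)},
    "7357 b (Ib)": {"GBS 67": (34, 56),  "PBS": (65, 74)},
}

survival = {
    strain: {arm: percent_survival(*counts) for arm, counts in arms.items()}
    for strain, arms in TABLE_48.items()
}

for strain, arms in survival.items():
    print(f"{strain}: GBS 67 {arms['GBS 67']}% vs PBS {arms['PBS']}%")
```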

GBS 59

The following offers examples of GBS 59 fragments. Nucleotide and aminoacid sequences of GBS 59 sequenced from serotype V isolated strain 2603are set forth below as SEQ ID NOS: 125 and 126. The GBS 59 polypeptideof SEQ ID NO: 126 is referred to as SAG1407. SEQ ID NO: 125ttaagcttcctttgattggcgtcttttcatgataactactgctccaagcataatgcttaaaccaataattgtgaaaagaattgtaccaataccacctgtttgtgggattgttacctttttattttctacacgtgtcgcatctttttggttgctgttagcaacgtagtcaatgttaccacctgttatgtatgacccttgattaactacaaacttaatattacctgccaacttagcaaatcctgctggagcaagtgtttcttcaaggttgtaagtaccgtctgcaagacctgtaacttcaaattgaccttgatcgtttgaagtgtaggtaatggctctagccttatctgttatccactcataagctgtacgagcctcaatgaaggctgcatcgtaatctgcttgtttagttttgataagttcttttgcagtaattcctttttcacctttttggtctgttgcagacaacttgttataagcagcgatagcttcatctaaagctattttcttagcagctaaagttttttgaccttctgattgatctgctttaagagcaaggtatttacctgctgagtttttcacaacgaattgtgcaccagccaaacggtcaccttgttcattagttttgacaaatttcttaccatgagtttcaacttttggttcagttgggttcaatggtgttgggttatcagaatctttggtattggtaatggttactttaccattttctagatttattgcacttccgtaaccagaaacacgttctgagatcatgtatgatttgttttctagaccagtgaatttacccgagaagttaccagatacttcaaatttgataccatttccaaggtcgattgtacctttagatgtttttgtcaatgatactgaagcaacagttttatctttatctttcaatgtgtaaacaacgtttacaccatcaggtgcaattccgtcagaccaagttttagcaactgttacttcaccctttgaaggtgtaacaggaagttcagtcaagtctttacctggtttgttaccatacgacaatttgatatcattggattctggattatcaataattgcttgaccattaacagtagcactataagtcaatgtaaattcaatatcagctgttttagctgctttttccaatttgcccaatccatcagctgtgaattttaatgtgaaaccacgggcatcaatgctaagttcatagtctgtatccttagcaaaagtttctgtagttcctgaagctttaaggctaacagttgaacccattgtcaaaccatttgacattatatctgtccaaaccaagttttcgtatttagaacctttgtgaatttttgttttaacttcataaggaacaactttaccgatttcagcagtagcagttgctttgtcacgtgcataattaccataatttgcgccagctgtcaaaagtctattaacatctgtcaatgctgtcaaatcgtttgttttagcaaagtttttatcaatttctggtttttcttcagtgttctttggataaacatgggcatcagcaacaacaccatcttcatttaccaatggaagagtgatgttaactggaaccgcttttgaagcagccaggagggaaccattattgttgtaagtagattttgatttaacttcaacaattttaaactcgcctttcaatcctttggtgttgaaaacaagtccagtatctccctctggtgtcaatccagacacggcctca
tcaatatttactgttatttcaggagtaccatctttattaattaaggctggtgttaatttgttaccttcttttgccttaacatattgcactttaccacttttatcttctttcaaagctaaagcaaagaacgcaccttcgatttctttagatccctcgccaaagtaaccagcaaggtcagaaatagctccacctttgtagtcttttccgttaagacctgtagttcctgggaagttacttttgttaagatttgattcggtttgcaaaatcttgtgcaaagtcactgtattagttgttgcttcatccgcaaacgctggtgcaactgagagcaatgacgttaaagtcagtaacaatgccgagaacattgcaaaata tttgttgattcttttcatSEQ ID NO: 126 MKRINKYFAMFSALLLTLTSLLSVAPAFADEATTNTVTLHKILQTESNLNKSNFPGTTGLNGKDYKGGAISDLAGYFGEGSKEIEGAFFALALKEDKSGKVQYVKAKEGNKLTPALINKDGTPEITVNIDEAVSGLTPEGDTGLVFNTKGLKGEFKIVEVKSKSTYNNNGSLLAASKAVPVNITLPLVNEDGVVADAHVYPKNTEEKPEIDKNFAKTNDLTALTDVNRLLTAGANYGNYARDKATATAEIGKVVPYEVKTKIHKGSKYENLVWTDIMSNGLTMGSTVSLKASGTTETFAKDTDYELSIDARGFTLKFTADGLGKLEKAAKTADIEFTLTYSATVNGQAIIDNPESNDIKLSYGNKPGKDLTELPVTPSKGEVTVAKTWSDGIAPDGVNVVYTLKDKDKTVASVSLTKTSKGTIDLGNGIKPEVSGNFSGKETGLENKSYMISERVSGYGSAINLENGKVTITNTKDSDNPTPLNPTEPKVETHGKKFVKTNEQGDRLAGAQFVVKNSAGKYLALKADQSEGQKTLAAKKIALDEAIAAYNKLSATDQKGEKGITAKELIKTKQADYDAAFIEARTAYEWITDKARAITYTSNDQGQFEVTGLADGTYNLEETLAPAGFAKLAGNIKFVVNQGSYITGGNIDYVANSNQKDATRVENKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKRR QSKEA
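The relationship between the two printed sequences can be checked directly. SEQ ID NO: 125, as printed, ends in the bases tttgttgattcttttcat, and the reverse complement of those 18 bases translates to MKRINK, the first six residues of SEQ ID NO: 126. This suggests that the nucleotide sequence is listed on the strand opposite the coding strand. The following is a minimal sketch of that check, assuming the standard genetic code; the function names are illustrative and are not taken from the specification.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption of this sketch: the specification does not name a translation table.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate DNA in frame 1; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Last 18 bases of SEQ ID NO: 125 as printed above.
TAIL_OF_SEQ_125 = "tttgttgattcttttcat"

coding_start = reverse_complement(TAIL_OF_SEQ_125)
print(coding_start)             # coding-strand orientation of the same bases
print(translate(coding_start))  # compare with the first residues of SEQ ID NO: 126
```

The same two helpers could be applied to the full printed sequence, provided the stray spaces left by the text layout are stripped first.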

Nucleotide and amino acid sequences of GBS 59 sequenced from serotype Visolated strain CJB111 are set forth below as SEQ ID NOS: 127 and 128.The GBS 59 polypeptide of SEQ ID NO: 128 is referred to as BO1575. SEQID NO: 127 ATGAAAAAAATCAACAAATGTCTTACAATGTTCTCGACACTGCTATTGATCTTAACGTCACTATTCTCAGTTGCACCAGCGTTTGCGGACGACGCAACAACTGATACTGTGACCTTGCACAAGATTGTCATGCCACAAGCTGCATTTGATAACTTTACTGAAGGTACAAAAGGTAAGAATGATAGCGATTATGTTGGTAAACAAATTAATGACCTTAAATCTTATTTTGGCTCAACCGATGCTAAAGAAATCAAGGGTGCTTTCTTTGTTTTCAAAAATGAAACTGGTACAAAATTCATTACTGAAAATGGTAAGGAAGTCGATACTTTGGAAGCTAAAGATGCTGAAGGTGGTGCTGTTCTTTCAGGGTTAACAAAAGACAATGGTTTTGTTTTTAACACTGCTAAGTTAAAAGGAATTTACCAAATCGTTGAATTGAAAGAAAAATCAAACTACGATAACAACGGTTCTATCTTGGCTGATTCAAAAGCAGTTCCAGTTAAAATCACTCTGCCATTGGTAAACAACCAAGGTGTTGTTAAAGATGCTCACATTTATCCAAAGAATACTGAAACAAAACCACAAGTAGATAAGAACTTTGCAGATAAAGATCTTGATTATACTGACAACCGAAAAGACAAAGGTGTTGTCTCAGCGACAGTTGGTGACAAAAAAGAATACATAGTTGGAACAAAAATTCTTAAAGGCTCAGACTATAAGAAACTGGTTTGGACTGATAGCATGACTAAAGGTTTGACGTTCAACAACAACGTTAAAGTAACATTGGATGGTGAAGATTTTCCTGTTTTAAACTACAAACTCGTAACAGATGACCAAGGTTTCCGTCTTGCCTTGAATGCAACAGGTCTTGCAGCAGTAGCAGCAGCTGCAAAAGACAAAGATGTTGAAATCAAGATCACTTACTCAGCTACGGTGAACGGCTCCACTACTGTTGAAATTCCAGAAACCAATGATGTTAAATTGGACTATGGTAATAACCCAACGGAAGAAAGTGAACCACAAGAAGGTACTCCAGCTAACCAAGAAATTAAAGTCATTAAAGACTGGGCAGTAGATGGTACAATTACTGATGCTAATGTTGCAGTTAAAGCTATCTTTACCTTGCAAGAAAAACAAACGGATGGTACATGGGTGAACGTTGCTTCACACGAAGCAACAAAACCATCACGCTTTGAACATACTTTCACAGGTTTGGATAATGCTAAAACTTACCGCGTTGTCGAACGTGTTAGCGGCTACACTCCAGAATACGTATCATTTAAAAATGGTGTTGTGACTATCAAGAACAACAAAAACTCAAATGATCCAACTCCAATCAACCCATCAGAACCAAAAGTGGTGACTTATGGACGTAAATTTGTGAAAACAAATCAAGCTAACACTGAACGCTTGGCAGGAGCTACCTTCCTCGTTAAGAAAGAAGGCAAATACTTGGCACGTAAAGCAGGTGCAGCAACTGCTGAAGCAAAGGCAGCTGTAAAAACTGCTAAACTAGCATTGGATGAAGCTGTTAAAGCTTATAACGACTTGACTAAAGAAAAACAAGAAGGCCAAGAAGGTAAAACAGCATTGGCTACTGTTGATCAAAAACAAAAAGCTTACAATGACGCTTTTGTTAAAGCTAACTACTCATATGAATGGGTTGCAGATAAAAAGGCTGATAATGTTGTTAAATTGATCTCTAACGCCGGTGGTCAATTTGAAATTACTGGTTTGGATAAAGGCACTTATGGCTTGGAAGAAACTCAAGCACCAGC
AGGTTATGCGACATTGTCAGGTGATGTAAACTTTGAAGTAACTGCCACATCATATAGCAAAGGGGCTACAACTGACATCGCATATGATAAAGGCTCTGTAAAAAAAGATGCCCAAGAAGTTCAAAACAAAAAAGTAACCATCCCACAAACAGGTGGTATTGGTACAATTCTTTTCACAATTATTGGTTTAAGCATTATGCTTGGAGCAGTAGTTATCATGAAAAAACGTCAATCAGAGGAAGCTTAA SEQ ID NO: 128MKKINKCLTMFSTLLLILTSLFSVAPAFADDATTDTVTLHKIVMPQAAFDNFTEGTKGKNDSDYVGKQINDLKSYFGSTDAKEIKGAFFVFKNETGTKFITENGKEVDTLEAKDAEGGAVLSGLTKDNGFVFNTAKLKGIYQIVELKEKSNYDNNGSILADSKAVPVKITLPLVNNQGVVKDAHIYPKNTETKPQVDKNFADKDLDYTDNRKDKGVVSATVGDKKEYIVGTKILKGSDYKKLVWTDSMTKGLTFNNNVKVTLDGEDFPVLNYKLVTDDQGFRLALNATGLAAVAAAAKDKDVEIKITYSATVNGSTTVEIPETNDVKLDYGNNPTEESEPQEGTPANQEIKVIKDWAVDGTITDANVAVKAIFTLQEKQTDGTWVNVASHEATKPSRFEHTFTGLDNAKTYRVVERVSGYTPEYVSFKNGVVTIKNNKNSNDPTPINPSEPKVVTYGRKFVKTNQANTERLAGATFLVKKEGKYLARKAGAATAEAKAAVKTAKLALDEAVKAYNDLTKEKQEGQEGKTALATVDQKQKAYNDAFVKANYSYEWVADKKADNVVKLISNAGGQFEITGLDKGTYGLEETQAPAGYATLSGDVNFEVTATSYSKGATTDIAYDKGSVKKDAQQVQNKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKKRQSEEA

The GBS 59 polypeptides contain an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 129 IPQTG (shown in italics in SEQ ID NOs: 126 and 128 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant GBS 59 protein from the host cell. Alternatively, in some recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
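As an illustration of how the cell wall anchor motif might be located in a polypeptide sequence, the following minimal sketch searches a C-terminal fragment of SEQ ID NO: 128 for the IPQTG motif of SEQ ID NO: 129. The helper names are hypothetical, and truncating the sequence immediately before the motif is only one assumed way to model "removing" it; the specification does not define construct boundaries.

```python
# C-terminal fragment copied from SEQ ID NO: 128 as printed above (not the full protein).
C_TERMINAL_FRAGMENT = "KKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKKRQSEEA"

# Cell wall anchor motif, SEQ ID NO: 129.
ANCHOR_MOTIF = "IPQTG"


def find_motif(sequence: str, motif: str) -> int:
    """Return the 0-based start index of the first occurrence of motif, or -1 if absent."""
    return sequence.find(motif)


def truncate_before_motif(sequence: str, motif: str) -> str:
    """Return the part of sequence that precedes the motif.

    Assumption of this sketch: everything from the motif onward is dropped.
    The sequence is returned unchanged when the motif is not present.
    """
    index = sequence.find(motif)
    return sequence if index < 0 else sequence[:index]


start = find_motif(C_TERMINAL_FRAGMENT, ANCHOR_MOTIF)
print(start)
print(truncate_before_motif(C_TERMINAL_FRAGMENT, ANCHOR_MOTIF))
```

The same search could be run with the IPKTG, LPKTG, LPFTG or LPSTG motifs named elsewhere in this section by changing the motif argument.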

Pilin motifs, containing conserved lysine (K) residues have beenidentified in the GBS 59 polypeptides. The pilin motif sequences areunderlined in each of SEQ ID NOs: 126 and 128, below. Conserved lysine(K) residues are marked in bold. The conserved lysine (K) residues arelocated at amino acid residues 202 and 212 and amino acid residues 489and 495 of SEQ ID NO: 126 and at amino acid residues 188 and 198 of SEQID NO: 128. The pilin sequences, in particular the conserved lysineresidues, are thought to be important for the formation of oligomeric,pilus-like structures of GBS 59. Preferred fragments of GBS 59 includeat least one conserved lysine residue. Preferably, fragments include atleast one pilin sequence. SEQ ID NO: 126MKRINKYFAMFSALLLTLTSLLSVAPAFADEATTNTVTLHKILQTESNLNKSNFPGTTGLNGKDYKGGATSDLAGYFGEGSKSIEGAFFALALKEDKSGKVQYVKAKEGNKLTPALINKDGTPEITVNIDEAVSGLTPEGDTGLVFNTKGLKGEFKIVEVKSKSTYNNNGSLLAASKAVPVNITLPLVNEDGVVADAHVY PKNTEEKPEIDKNFAKTNDLTALTDVNRLLTAGANYGNYARDKATATAETGKVVPYEVKTKIHKGSKYENLVWTDIMSNGLTMGSTVSLKASGTTETFAKDTDYELSIDARGFTLKFTADGLGKLEKAAKTADIEFTLTYSATVNGQAIIDNPESNDIKLSYGNKPGKDLTELPVTPSKGEVTVAKTWSDGIAPDGVNVVYTLKDKDKTVASVSLTKTSKGTIDLGNGIKEEVSGNFSGKFTGLENKSYMISERVSGYGSAINLENGKVTITNTKDSDNPTPLNPTEPKVETHGK KFVKTNEQGDRLAGAQFVVKNSAGKYLALKADQSEGQKTLAAKKIALDEAIAAYNKLSATDQKGEKGITAKELIKTKQADYDAAFIEARTAYEWITDKARAITYTSNDQGQFEVTGLADGTYNLEETLAPAGFAKLAGNIKFVVNQGSYITGGNIDYVANSNQKDATRVENKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKRR QSKEA SEQ ID NO: 128MKKINKCLTMFSTLLLILTSLFSVAPAFADDATTDTVTLHKIVMPQPAFDNFTEGTKGKNDSDYVGKQINDLKSYFGSTDAKEIKGAFFVFKNETGTKFITENGKEVDTLEAKDAEGGAVLSGLTKDNGFVFNTAKLKGIYQIVELKEKSNYDNNGSILADSKAVPVKITLPLVNNQGVVKDAHIYPKNTETKPQVDK 
NFADKDLDYTDNRKDKGVVSATVGDKKEYIVGTKILKGSDYKKLVWTDSMTKGLTFNNNVKVTLDGEDFPVLNYKLVTDDQGFRLALNATGLAAVAAAAKDKDVEIKITYSATVNGSTTVEIPETNDVKLDYGNNPTEESEPQEGTPANQEIKVIKDWAVDGTITDANVAVKAIFTLQEKQTDGTWVNVASHEATKPSRFEHTFTGLDNAKTYRVVERVSGYTPEYVSFKNGVVTIKNNKNSNDPTPINPSEPKVVTYGRKFVKTNQANTERLAGATFLVKKEGKYLARKAGAATAEAKAAVKTAKLALDEAVKAYNDLTKEKQEGQEGKTALATVDQKQKAYNDAFVKANYSYEWVADKKADNVVKLISNAGGQFEITGLDKGTYGLEETQAPAGYATLSGDVNFEVTATSYSKGATTDIAYDKGSVKKDAQQVQNKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKKRQSEEA

An E box containing a conserved glutamic residue has also beenidentified in each of the GBS 59 polypeptides. The E box motif isunderlined in each of SEQ ID NOs: 126 and 128 below. The conservedglutamic acid (E) is marked in bold at amino acid residue 621 in SEQ IDNO: 126 and at amino acid residue 588 in SEQ ID NO: 128. The E boxmotif, in particular the conserved glutamic acid residue, is thought tobe important for the formation of oligomeric pilus-like structures ofGBS 59. Preferred fragments of GBS 59 include the conserved glutamicacid residue. Preferably, fragments include the E box motif. SEQ ID NO:126 MKRINKYFAMFSALLLTLTSLLSVAPAFADEATTNTVTLHKILQTESNLNKSNFPGTTGLNGKDYKGGAISDLAGYFGEGSKEIEGAFFALALKEDKSGKVQYVKAKEGNKLTPALINKDGTPEITVNIDEAVSGLTPEGDTGLVFNTKGLKGEFKIVEVKSKSTYNNNGSLLAASKAVPVNITLPLVNEDGVVADAHVYPKNTEEKPEIDKNFAKTNDLTALTDVNRLLTAGANYGNYARDKATATAEIGKVVPYEVKTKIHKGSKYENLVWTDIMSNGLTMGSTVSLKASGTTETFAKDTDYELSIDARGFTLKFTADGLGKLEKAAKTADIEFTLTYSATVNGQAIIDNPESNDIKLSYGNKPGKDLTELPVTPSKGEVTVAKTWSDGIAPDGVNVVYTLKDKDKTVASVSLTKTSKGTIDLGNGIKFEVSGNFSGKFTGLENKSYMISERVSGYGSAINLENGKVTITNTKDSDNPTPLNPTEPKVETHGKKEVKTNEQGDRLAGAQFVVKNSAGKYLALKADQSEGQKTLAAKKIALDEAIAAYNKLSATDQKGEKGITAKELIKTKQADYDAAFIEARTAYEWITDKARAITYTSNDQGQFEVTGLADGTYNLEETLAPAGFAKLAGNIKEVVNQGSYITGGNIDYVANSNQKDATRVENKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKRR QSKEA SEQ ID NO: 
128MKKINKCLTMFSTLLLILTSLFSVAPAFADDATTDTVTLHKIVMPQAAFDNFTEGTKGKNDSDYVGKQINDLKSYFGSTDAKEIKGAFFVFKNETGTKFITENGKEVDTLEAKDAEGGAVLSGLTKDNGFVFNTAKLKGIYQIVELKEKSNYDNNGSILADSKAVPVKITLPLVNNQGVVKDAHIYPKNTETKPQVDKNFADKDLDYTDNRKDKGVVSATVGDKKEYTVGTKILKGSDYKKLVWTDSMTKGLTFNNNVKVTLDGEDFPVLNYKLVTDDQGFRLALNATGLAAVAAAAKDKDVEIKITYSATVNGSTTVEIPETNDVKLDYGNNPTEESEPQEGTPANQEIKVIKDWAVDGTITDANVAVKAIFTLQEKQTDGTWVNVASHEATKPSRFEHTFTGLDNAKTYRVVERVSGYTPEYVSFKNGVVTIKNNKNSNDPTPINPSEPKVVTYGRKFVKTNQANTERLAGATFLVKKEGKYLARKAGAATAEAKAAVKTAKLALDEAVKAYNDLTKEKQEGQEGKTALATVDQKQKAYNDAFVKANYSYEWVADKKADNVVKLISNAGGQFEITGLDKGTYGLEETQAPAGYATLSGDVNFEVTATSYSKGATTDIAYDKGSVKKKAQQVQNKKVTIPQTGGIGTILFTIIGLSIMLGAVVIMKKRQSEEA

Female mice were immunized with either SAG1407 (SEQ ID NO: 126) or BO1575 (SEQ ID NO: 128) in an active maternal immunization assay. Pups bred from the immunized female mice survived GBS challenge better than control (PBS) treated mice. Results of the active maternal immunization assay using the GBS 59 immunogenic compositions are shown in Table 17, below.

TABLE 17
Active maternal immunization assay for GBS 59

                        GBS 59                        PBS
Challenge GBS strain    Dead/treated   Survival (%)   Dead/treated   Survival (%)   FACS
(serotype)
CJB111 (V)*             7/20           65             41/49          16             493
18RS21 (II)**           18/30          40             39/40          2.5            380

*immunized with BO1575
**immunized with SAG1407
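The survival percentages in Table 17 follow from the dead/treated counts. A minimal sketch of that arithmetic is given below; the function name and the dictionary layout are illustrative choices and are not part of the specification.

```python
def survival_percent(dead: int, treated: int) -> float:
    """Percentage of treated pups that survived the challenge."""
    if treated <= 0:
        raise ValueError("treated must be positive")
    return 100.0 * (treated - dead) / treated


# (dead, treated) counts copied from Table 17.
TABLE_17 = {
    ("CJB111 (V)", "GBS 59"): (7, 20),
    ("CJB111 (V)", "PBS"): (41, 49),
    ("18RS21 (II)", "GBS 59"): (18, 30),
    ("18RS21 (II)", "PBS"): (39, 40),
}

for (strain, group), (dead, treated) in TABLE_17.items():
    print(f"{strain:12s} {group:7s} {survival_percent(dead, treated):.1f}%")
```

For the CJB111 challenge, for example, 7 dead out of 20 treated corresponds to 13 survivors, or 65%.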

Opsonophagocytosis assays also demonstrated that antibodies against BO1575 are opsonic for GBS serotype V, strain CJB111. See FIG. 67.

GBS 52

Examples of polynucleotide and amino acid sequences for GBS 52 are setforth below. SEQ ID NO: 20 and 21 represent GBS 52 sequences from GBSserotype V, strain isolate 2603. SEQ ID NO: 20ATGAAACAAACATTAAAACTTATGTTTTCTTTTCTGTTGATGTTAGGGACTATGTTTGGAATTAGCCAAACTGTTTTAGCGCAAGAAACTCATCAGTTGACGATTGTTGATCTTGAAGCAAGGGATATTGATCGTCCAAATCCACAGTTGGAGATTGCCCCTAAAGAAGGGACTCCAATTGAAGGAGTACTCTATCAGTTGTACCAATTAAAATCAACTGAAGATGGGGATTTGTTGGCAGATTGGAATTCCCTAACTATCACAGAATTGAAAAAACAGGCGCAGCAGGTTTTTGAAGCGACTACTAATCAACAAGGAAAGGCTACATTTAACCAACTACCAGATGGAATTTATTATGGTCTGGCGGTTAAAGCCGGTGAAAAAAATCGTAATGTCTCAGCTTTCTTGGTTGACTTGTCTGAGGATAAAGTGATTTATCCTAAAATCATCTGGTCCACAGGTGAGTTGGACTTGCTTAAAGTTGGTGTGGATGGTGATACCAAAAAACCACTAGCAGGCGTTGTCTTTGAACTTTATGAAAAGAATGGTAGGACTCCTATTCGTGTGAAAAATGGGGTGCATTCTCAAGATATTGACGCTGCAAAACATTTAGAAACAGATTCATCAGGGCATATCAGAATTTCCGGGCTCATCCATGGGGACTATGTCTTAAAAGAAATCGAGACACAGTCAGGATATCAGATCGGACAGGCAGAGACTGCTGTGACTATTGAAAAATCAAAAACAGTAACAGTAACGATTGAAAATAAAAAAGTTCCGACACCTAAAGTGCCATCTCGAGGAGGTCTTATTCCCAAAACAGGTGAGCAACAGGCAATGGCACTTGTAATTATTGGTGGTATTTTAATTGCTTTAGCCTTACGATTACTATCAAAACAT CGGAAACATCAAAATAAGGATSEQ ID NO: 21 MKQTLKLMFSFLLMLGTMFGISQTVLAQETHQLTIVHLEARDIDRPNPQLEIAPKEGTPIEGVLYQLYQLKSTEDGDLLAHWNSLTITELKKQAQQVFEATTNQQGKATFNQLPDGIYYGLAVKAGEKNRNVSAFLVDLSEDKVIYPKIIWSTGELDLLKVGVDGDTKKPLAGVVFELYEKNGRTPTRVKNGVHSQDIDAAKHLETDSSGHIRISGLIHGDYVLKEIETQSGYQIGQAETAVTIEKSKTVTVTIENKKVPTPKVPSRGGLIPKTGEQQAMALVIIGGILIALALRLLSKH RKHQNKD

GBS 52 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 124 IPKTG (shown in italics in SEQ ID NO: 21, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant GBS 52 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in GBS 52. The pilin motif sequence isunderlined in SEQ ID NO: 21, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residues 148 and 160. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of GBS 52 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 21MKQTLKLMFSFLLMLGTMFGISQTVLAQETHQLTIVHLEARDIDRPNPQLEIAPKEGTPIEGVLYQLYQLKSTEDGDLLAHWNSLTITELKKQAQQVFEATTNQQGKATFNQLPDGIYYGLAVKAGEKNRNVSAFLVDLSEDKVIYPKII WSTGELDLLKVGVDGDTKKPLAGVVFELYEKNGRTPTRVKNGVHSQDIDAAKHLETDSSGHIRISGLIHGDYVLKEIETQSGYQIGQAETAVTIEKSKTVTVTIENKKVPTPKVPSRGGLIPKTGEQQAMALVIIGGILIALALRLLSKH RKHQNKD
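The residue positions quoted above use 1-based numbering from the N-terminal methionine. As a minimal sketch of how the stated lysine positions can be checked against the printed sequence, the following reads residues 148 and 160 from the first 160 residues of SEQ ID NO: 21. The helper name is hypothetical.

```python
# First 160 residues of SEQ ID NO: 21 as printed above, with layout spaces removed.
GBS52_N_TERMINAL_160 = (
    "MKQTLKLMFSFLLMLGTMFGISQTVLAQETHQLTIVHLEARDIDRPNPQL"
    "EIAPKEGTPIEGVLYQLYQLKSTEDGDLLAHWNSLTITELKKQAQQVFEA"
    "TTNQQGKATFNQLPDGIYYGLAVKAGEKNRNVSAFLVDLSEDKVIYPKII"
    "WSTGELDLLK"
)


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the text."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} is outside 1..{len(sequence)}")
    return sequence[position - 1]


for position in (148, 160):
    print(position, residue_at(GBS52_N_TERMINAL_160, position))
```

The same helper could be used for the conserved glutamic acid positions reported for the E box motifs, given the full-length sequence.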

An E box containing a conserved glutamic residue has been identified inGBS 52. The E-box motif is underlined in SEQ ID NO: 21, below. Theconserved glutamic acid (E), at amino acid residue 226, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of GBS 52. Preferred fragments of GBS 52 includethe conserved glutamic acid residue. Preferably, fragments include the Ebox motif. SEQ ID NO: 21MKQTLKLMFSFLLMLGTMFGISQTVLAQETHQLTIVHLEARDIDRPNPQLEIAPKEGTPIEGVLYQLYQLKSTEDGDLLAHWNSLTITELKKQAQQVFEATTNQQGKATFNQLPDGIYYGLAVKAGEKNRNVSAFLVDLSEDKVIYPKIIWSTGELDLLKVGVDGDTKKPLAGVVFELYEKNGRTPTRVKNGVHSQDIDAAKHLETDSSGHIRISGLIHGDYVLKEIETQSGYQIGQAETAVTIEKSKTVTVTIENKKVPTPKVPSRGGLIPKTGEQQAMALVIIGGILIALALRLLSKH RKHQNKDSAG0647

Examples of polynucleotide and amino acid sequences for SAG0647 are setforth below. SEQ ID NO: 22 and 23 represent SAG0647 sequences from GBSserotype V, strain isolate 2603. SEQ ID NO: 22ATGGGACAAAAATCAAAAATATCTCTAGCTACGAATATTCGTATATGGATTTTTCGTTTAATTTTCTTAGCGGGTTTCCTTGTTTTGGCATTTCCCATCGTTAGTCAGGTCATGTACTTTCAAGCCTCTCACGCCAATATTAATGCTTTTAAAGAAGGTGTTACCAAGATTGACCGGGTGGAGATTAATCGGCGTTTAGAACTTGCTTATGCTTATAAGGCCAGTATAGCAGGTGCCAAAACTAATGGCGAATATCCAGCGCTTAAAGACCCCTACTCTGGTGAACAAAAGCAGGCAGGGGTCGTTGAGTACGCCCGCATGCTTGAAGTCAAAGAACAAATAGGTCATGTGATTATTCCAAGAATTAATCAGGATATCCCTATTTACGCTGGCTCTGCTGAAGAAAATCTTCAGAGGGGCGTTGGACATTTAGAGGGGACCAGTCTTCCAGTCGGTGGTGAGTCAACTCATGCCGTTGTAACTGCCCATCGAGGGCTACCAACGGCCAAGCTATTTACGAATTTAGACAAGGTAACAGTAGGTGACCGTTTTTACATTGAAGACATCGGCGGAAAGATTGCTTATCAGGTAGACCAAATCAAAGTTATCGGCCCTGATCAGTTAGAGGATTTGTACGTGATTCAAGGAGAAGATCACGTCACCCTATTAACTTGCACAGCTTATATGATAAATAGTCATCGCCTCCTCGTTCGAGGCAAGCGAATTCCTTATGTGGAAAAAACAGTGCAGAAAGATTCAAAGACCTTCAGGCAACAACAATACCTAACCTATGCTATGTGGGTAGTCGTTGGACTTATGTTGCTGTCGCTTCTCATTTGGTTTAAAAAGACGAAACAGAAAAAGCGGAGAAAGAATGAAAAAGCGGCTAGTCAAAATAGT CACAATAATTCGAAATAASEQ ID NO: 23 MGQKSKISLATNIRIWIERLIFLAGFLVLAFPIVSQVMYFQASHANINAFKEAVTKIDRVEINRRLELAYAYNASIAGAKTNGEYPALKDPYSAEQKQAGVVEYARMLEVKEQIGHVIIPRINQDIPIYAGSAEENLQRGVGHLEGTSLPVGGESTHAVLTAHRGLPTAKLFTNLDKVTVGDRFYIEHIGGKIAYQVDQIKVIAPDQLEDLYVIQGEDHVTLLTCTPYMINSHRLLVRGKRIPYVEKTVQKDSKTFRQQQYLTYAMWVVVGLILLSLLIWFKKTKQKKRRKNEKAASQNS HNNSKSAG0648

Examples of polynucleotide and amino acid sequences for SAG0648 are setforth below. SEQ ID NO: 24 and 25 represent SAG0648 sequences from GBSserotype V, strain isolate 2603. SEQ ID NO: 24ATGGGAAGTCTGATTCTCTTATTTCCGATTGTGAGCCAGGTAAGTTACTACCTTGCTTCGCATCAAAATATTAATCAATTTAAGCGGGAAGTCGCTAAGATTGATACTAATACGGTTGAACGACGCATCGCTTTAGCTAATGCTTACAATGAGACGTTATCAAGGAATCCCTTGCTTATAGACCCTTTTACCAGTAAGCAAAAAGAAGGTTTGAGAGAGTATGCTCGTATGCTTGAAGTTCATGAGCAAATAGGTCATGTGGCAATCCCAAGTATTGGGGTTGATATTCCAATTTATGCTGGAACATCCGAAACTGTGCTTCAGAAAGGTAGTGGGCATTTGGAGGGAACCAGTCTTCCAGTGGGAGGTTTGTCAACCCATTCAGTACTAACTGCCCACCGTGGCTTGCCAACAGCTAGGCTATTTACCGACTTAAATAAAGTTAAAAAAGGCCAGATTTTCTATGTGACGAACATCAAGGAAACACTTGCCTACAAAGTCGTGTCTATCAAAGTTGTGGATCCAACAGCTTTAAGTGAGGTTAAGATTGTCAATGGTAAGGATTATATAACCTTGCTGACTTGCACACCTTACATGATCAATAGTCATGGTCTCTTGGTAAAAGGAGAGCGTATTCCTTATGATTCTACCGAGGCGGAAAAGCACAAAGAACAAACGGTACAAGATTATCGTTTGTCACTAGTGTTGAAGATACTACTAGTATTATTAATTGGACTCTTCATCGTGATAATGATGAGAAGATGGATGGAACATCGTCAATAA SEQ ID NO: 25MGSLILLFPIVSQVSYYLASHQNINQFKREVAKIDTNTVERRIALANAYNETLSRNPLLIDPFTSKQKEGLREYARMLEVHEQTGHVATPSIGVDIPIYAGTSETVLQKGSGHLEGTSLPVGGLSTHSVLTAHRGLPTARLFTDLNKVKKGQIFYVTNIKETLAYKVVSIKVVDPTALSEVKIVNGKDYITLLTCTPYMINSHRLLVKGERIPYDSTEAEKHKEQTVQDYRLSLVLKILLVLLIGLFIVI MMRRWMQHRQGBS 150

Examples of polynucleotide and amino acid sequences for GBS 150 are setforth below. SEQ ID NO: 26 and 27 represent GBS 150 sequences from GBSserotype V, strain isolate 2603. SEQ ID NO: 26ATGAAAAAGATTAGAAAAAGTTTAGGACTTCTAGTATGTTGGTTTTTAGGATTGGTACAATTAGCGTTTTTTTCGGTAGGCAGTGTAAATGCTGATACCCCTAATCAACTAACAATCACAGAGATAGGACTTCAGCGAAATACTACAGAGGAGGGGATTTCTTATGGTTTATGGACTGTGACTGACAAGTTAAAAGTTGATTTATTGAGGCAAATGACAGATAGCGAATTGAAGGAGAAGTATAAGAGTATGTTGACTTCTCCTAGTGATACTAATGGTCAGACAAAGATAGGACTGGGAAATGGTTCGTACTTTGGTCGTGCTTATAAAGCTGATGAAAGGGTTTCAACAATAGTACCTTTTTATATTGAATTAGGAGATGATAAGTTATGAAATCAATTACAGATAAATCGTAAGCGAAAAGTTGAAACAGGCGGATTAAAACTTATTAAATATACAAAAGAAGGAAAGATAAAGAAAAGGCTATCCGGAGTAATATTTGTATTATACGATAACCAGAATGAGGGAGTTCGCTTTAAAAATGGACGATTTACGACGGATCAAGATGGGATTAGTTGATTAGTAACTGATGATAAGGGAGAAATTGAGGTTGAAGGTTTATTACGTGGTAAGTATATTTTTCGAGAAGCAAAAGCACTAACTGGTTACCGTATATGTATGAAGGATGGTGTAGTTGCTGTAGTTGCTAATAAAACACAGGAAGTAGAGGTAGAAAAcGAAAAAGAAACTCCTCCACCAACAAATCCTAAACCATCACAACCGCTTTTTGCACAATCATTTCTTCCTAAAACAGGAATGATTATTGGTGGAGGACTGACAATTCTTGGTTGTATTATTTTGGGAATTTTGTTTATCTTTTTAAGAAAAACTAAAAATAGCAAATCTGAAAGAAACGATACAGTA SEQ ID NO: 27MKKIRKSLGLLLCCFLGLVQLAFESVASVNADTPNQLTITQIGLQPNTTEEGISYRLWTVTDNLKVDLLSQMTDSELNQKYKSILTSPTDTNGQTKIALPNGSYFGRAYKADQSVSTIVPEYIELPDDKLSNQLQINPKRKVETGRLKLIKYTKEGKIKKRLSGVIFVLYDNQNQPVRFKNGRFTTDQDGITSLVTDDKGEIEVEGLLPGKYIFREAKALTGYRISMKDAVVAVVANKTQEVEVENEKETPPPTNPKPSQPLFPQSFLPKTGMIIGGGLTILGCIILGILFIFLRKTKNS KSERNDTV

GBS 150 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 130 LPKTG (shown in italics in SEQ ID NO: 27 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant GBS 150 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

As discussed above, a pilin motif, containing a conserved lysine (K)residue has been identified in GBS 150. The pilin motif sequence isunderlined in SEQ ID NO: 27, below. Conserved lysine (K) residues aremarked in bold, at amino acid residues 139 and 148. The pilin sequence,in particular the conserved lysine residues, are thought to be importantfor the formation of oligomeric, pilus-like structures of GBS 150.Preferred fragments of GBS 150 include a conserved lysine residue.Preferably, fragments include the pilin sequence. SEQ ID NO: 27MKKIRKSLGLLLCCFLGLVQLAFFSVASVNADTPNQLTITQIGLQPNTTEEGISYRLWTVTDNLKVDLLSQMTDSELNQKYKSILTSPTDTNGQTKIALPNGSYFGRAYKADQSVSTIVPFYIELPDDKLSNQLQINPKRKVETGRLK LIKYTKEGKIKKRLSGVIFVLYDNQNQPVRFKNGRFTTDQDGITSLVTDDKGEIEVEGLLPGKYIFREAKALTGYRISMKDAVVAVVANKTQEVEVENEKETPPPTNPKPSQPLFPQSFLPKTGMIIGGGLTILGCTTLGILFIFLRKTKNS KSERNDTV

An E box containing a conserved glutamic residue has also beenidentified in GBS 150. The E box motif is underlined in SEQ ID NO: 27below. The conserved glutamic acid (E), at amino acid residue 216, ismarked in bold. The E box motif, in particular the conserved glutamicacid residue, is thought to be important for the formation of oligomericpilus-like structures of GBS 150. Preferred fragments of GBS 150 includethe conserved glutamic acid residue. Preferably, fragments include the Ebox motif. SEQ ID NO: 27MKKIRKSLGLLLCCFLGLVQLAFFSVASVNADTPNQLTITQIGLQPNTTEEGISYRLWTVTDNLKVDLLSQMTDSELNQKYKSILTSPTDTNGQTKIALPNGSYFGRAYKADQSVSTIVPFYIELPDDKLSNQLQINPKRKVETGRLKLIKYTKEGKIKKRLSGVIFVLYDNQNQPVRFKNGRFTTDQDGITSLVTDDKGEIEVEGLLPGKYIFREAKALTGYRISMKDAVVAVVANKTQEVEVENEKETPPPTNPKPSQPLFPQSFLPKTGMIIGGGLTILGCIILGILFIPLRKTKNS KSERNDTV

Examples of polynucleotide and amino acid sequences for SAG1405 are setforth below. SEQ ID NO: 28 and 29 represent SAG1405 sequences from GBSserotype V, strain isolate 2603. SEQ ID NO: 28ATGGGAGGAAAATTTCAGAAAAACCTTAAGAAATCGGTCGTTTTAAATCGATGGATGAATGTAGGCTTGATACTATTGTTCTTAGTTGGTCTTTTGATAACCTCATATCCTTTTATTTCAAATTGGTACTATAATATTAAAGCTAATAATCAAGTAACTAACTTTGATAATCAAACCCAAAAATTAAATACTAAAGAGATTAATAGACGATTTGAGTTAGCAAAAGCTTATAATAGAACACTGGACCCAAGCCGCCTATCAGATCCCTATACTGAAAAAGAAAAAAAAGGTATTGCTGAATACGCCCACATGCTTGAGATTGCTGAAATGATTGGATATATTGATATACCGTCTATCAAGCAAAAATTACCTATCTATGCGGGGACTACCAGTAGTGTTCTTGAAAAAGGAGCAGGACACCTTGAAGGAACCTCCTTGCCAATTGGTGGAAAAAGTTCACATACTGTTATCACAGCTCATCGCGGCTTACCTAAAGCTAAGTTATTTACAGATTTAGATAAACTTAAAAAAGGAAAAATTTTTTATATTCATAATATCAAAGAAGTTTTAGCCTATAAGGTTGATCAAATAAGTGTTGTAAAGCCAGATAATTTTTCTAAATTATTGGTTGTTAAAGGTAAGGATTATGCGACTTTGCTAACATGTACACCTTATTCGATTAATTCACATCGTTTACTAGTTAGAGGGCATCGAATCAAGTATGTACCTCCTGTTAAAGAAAAGAACTATTTAATGAAAGAATTGCAAACACACTATAAACTTTATTTCCTCTTATCAATCCTAGTTATTCTTATATTAGTCGCTTTACTATTATATTTAAAACGAAAATTTAAAGAGAGAAAGAGAAAGGGAAATCAAAAATGA SEQ ID NO: 29MGGKFQKNLKKSVVLNRWMNVGLILLFLVGLLITSYPFISNWYYNIKANNQVTNFDNQTQKLNTKEINRRFELAKAYNRTLDPSRLSDPYTEKEKKGIAEYAHMLEIAEMIGYIDIPSIKQKLPIYAGTTSSVLEKGAGHLEGTSLPIGGKSSHTVITAHRGLPKAKLFTDLDKLKKGKIFYIHNIKEVLAYKVDQISVVKPDNFSKLLVVKGKDYATLLTCTPYSINSHRLLVRGHRIKYVPPVKEKNYLMKELQTRYKLYFLLSILVILILVALLLYLKRKPKERKRKGNQKSAG1406

Examples of polynucleotide and amino acid sequences for SAG1406 are set forth below. SEQ ID NO: 30 and 31 represent SAG1406 sequences from GBS serotype V, strain isolate 2603.

SEQ ID NO: 30
GTGAAGACTAAAAAAATCATCAAAAAAACAAAAAAAAAGAAGAAGTCAAATCTTCCTTTTATCATTCTTTTTCTAATAGGTCTATCTATTTTATTGTATCCAGTGGTATCACGTTTTTACTATACGATAGAATCTAATAATCAAACACAGGATTTTGAGAGAGCTGCTAAAAAACTTAGTCAGAAAGAAATCAATCGACGTATGGCTCTAGCACAAGCTTATAATGATTCTTTAAATAATGTCCATCTTGAAGATCCTTATGAGAAAAAACGAATTCAAAAGGGGGTAGCAGAGTACGCCCGTATGTTAGAGGTAAGTGAAAAAATCGGAACAATTTCAGTTCCTAAGATAGGTCAAAAACTCCCTATATTTGCAGGTTCAAGTCAAGAAGTTCTATCTAAAGGAGCAGGGCATTTAGAAGGTACCTCTCTTCCAATTGGGGGCAATAGTACACATACTGTTATAACAGCGCATTCAGGAATTCCAGATAAAGAACTCTTTTCTAACCTTAAAAAGTTAAAAAAAGGAGATAAGTTTTATATTCAAAACATAAAAGAAACGATAGCATATCAAGTAGATCAGATAAAAGTCGTTACACCCGATAACTTTTCAGATTTGTTGGTTGTTCCTGGACATGATTATGCAACCTTATTGACTTGCACCCCGATTATGATCAATACACACAGACTTTTAGTAAGGGGACATCGTATCCCTTATAAAGGTCCTATTGATGAAAAATTAATAAAAGACGGTCATTTAAACACGATTTATAGATATCTATTCTATATATCTTTAGTTATTATTGCTTGGTTACTTTGGTTAATAAAACGTCAACGTCAAAAAAATCGTTTAGCAAGTGTTAGAAAAGGAATTGAATCATAA

SEQ ID NO: 31
MKTKKIIKKTKKKKKSNLPFIILFLIGLSILLYPVVSRFYYTIESNNQTQDFERAAKKLSQKEINRRMALAQAYNDSLNNVHLEDPYEKKRIQKGVAEYARMLEVSEKIGTISVPKIGQKLPIFAGSSQEVLSKGAGHLEGTSLPIGGNSTHTVITAHSGIPDKELFSNLKKLKKGDKFYIQNIKETIAYQVDQIKVVTPDNFSDLLVVPGHDYATLLTCTPTMINTHRLLVRGHRIPYKGPIDEKLIKDGHLNTIYRYLFYISLVIIAWLLWLIKRQRQKNRLASVRKGIES

01520

An example of an amino acid sequence for 01520 is set forth below. SEQ ID NO: 32 represents a 01520 sequence from GBS serotype III, strain isolate COH1.

SEQ ID NO: 32
MIRRYSANFLAILGIILVSSGIYWGWYNINQAHQADLTSQHIVKVLDKSITHQVKGSENGELPVKKLDKTDYLGTLDIPNLKLHLPVAANYSFEQLSKTPTRYYGSYLTNNMVICAHNFPYHFDALKNVDMGTDVYFTTTTGQIYHYKISNREIIEPTAIEKVYKTATSDNDWDLSLFTCTKAGVARVLVRCQLIDVKN

01521

An example of an amino acid sequence for 01521 is set forth below. SEQ ID NO: 33 represents a 01521 sequence from GBS serotype III, strain isolate COH1.

SEQ ID NO: 33
MIYKKILKITLLLLFSLSTQLVSADTNDQMKTGSITIQNKYNNQGIAGGNLLVYQVAQAKDVDGNQVFTLTTPFQGIGIKDDDLTQVNLDSNQAKYVNLLTKAVHKTQPLQTFDNLPAEGIVANNLPQGIYLFIQTKTAQGYELMSPFILSIPKDGKYDITAFEKMSPLNAKPKKEETITPTVTHQTKGKLPFTGQVWWPIPILIMSGLLCLIIALKWRRRRD

01521 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 132 LPFTG (shown in italics in SEQ ID NO: 33 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 01521 protein from the host cell. Alternatively, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, containing conserved lysine (K) residues have beenidentified in 01521. The pilin motif sequences are underlined in SEQ IDNO: 33, below. Conserved lysine (K) residues are marked in bold, atamino acid residues 154 and 165 and at amino acid residues 174 and 188.The pilin sequences, in particular the conserved lysine residues, arethought to be important for the formation of oligomeric, pilus-likestructures of 01521. Preferred fragments of 01521 include at least oneconserved lysine residue. Preferably, fragments include at least onepilin sequence. SEQ ID NO: 33MIYKKILKITLLLLFSLSTQLVSADTNDQMKTGSITIQNKYNNQGIAGGNLLVYQVAQAKDVDGNQVFTLTTPFQGIGIKDDDLTQVNLDSNQAKYVNLLTKAVHKTQPLQTFDNLPAEGIVANNLPQGIYLFIQTKTAQGYELMSPFIL SIPKDGKYDITAFEKMSPLNAKPKKEETITPTVTHQTK GKLPFTGQVWWP TPILIMSGLLCLIIALKWRRRRD

An E box containing a conserved glutamic residue has also beenidentified in 01521. The E box motif is underlined in SEQ ID NO: 33below. The conserved glutamic acid (E), at amino acid residue 177, ismarked in bold. The E box motif, in particular the conserved glutamicacid residue, is thought to be important for the formation of oligomericpilus-like structures of 01521. Preferred fragments of 01521 include theconserved glutamic acid residue. Preferably, fragments include the E boxmotif. SEQ ID NO: 33 MIYKKILKTTLLLLFSLSTQLVSADTNDQMKTGSTTIQNKYNNQGIAGGNLLVYQVAQAKDVDGNQVFTLTTPFQGIGIKDDDLTQVNLDSNQAKYVNLLTKAVHKTQPLQTFDNLPAEGIVANNLPQGIYLFIQTKTAQGYELMSPFTLSIPKDGKYDITAFEKMSPLNAKPKKEETITPTVTHQTKGKLPFTGQWPIP ILIMSGLLCLIIALKWRRRRD01522

An example of an amino acid sequence for 01522 is set forth below. SEQ ID NO: 34 represents a 01522 sequence from GBS serotype III, strain isolate COH1.

SEQ ID NO: 34
MAYPSLANYWNSFHQSRAIMDYQDRVTHMDENDYKKITNRAKEYNKQFKTSGMKWHMTSQERLDYNSQLAIDKTGNMGYISIPKINIKLPLYHGTSEKVLQTSIGHLEGSSLPIGGDSTHSILSGHRGLPSSRLFSDLDKLKVGDHWTVSILNETYTYQVDQIRTVKPDDLRDLQIVKGKDYQTLVTCTPYGVNTHRLLVRGHRVPNDNGNALVVAEAIQIEPIYIAPFIAIFLTLILLLISLEVTRRARQRKKILKQAMRKEENNDL

01523

An example of an amino acid sequence for 01523 is set forth below. SEQID NO: 35 represents a 01523 sequence from GBS serotype III, strainisolate COH1. SEQ ID NO: 35MKKKMIQSLLVASLAFGMAVSPVTPIAFAAETGTITVQDTQKGATYKAYKVFDAEIDNANVSDSNKDGASYLIPQGKEAEYKASTDFNSLFTTTTNGGRTYVTKKDTASANEIATWAKSISANTTPVSTVTESNNDGTEVINVSQYGYYYVSSTVNNGAVIMVTSVTPNATIHEKNTDATWGDGGGKTVDQKTYSVGDTVKYTITYKNAVNYHGTEKVYQYVIKDTMPSASVVDLNEGSYEVTITDGSGNITTLTQGSEKATGKYNLLEENNNFTITIPWAATNTPTGNTQNGANDDFFYKGINTITVTYTGVLKSGAKPGSADLPENTNIATINPNTSNDDPGQKVTVRDGQITIKKIDGSTKASLQGAIFVLKNATGQFLNFNDTNNVEWGTEANATEYTTGADGIITITGLKEGTYYLVEKKAPLGYNLLDNSQKVILGDGATDTTNSDNLLVNPTVENNKGTELPSTGGIGTTIEYIIGAILVIGAGIVLVARRRL RS

01523 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 131 LPSTG (shown in italics in SEQ ID NO: 35 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 01523 protein from the host cell. Alternatively, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

An E box containing a conserved glutamic residue has also beenidentified in 01523. The E box motif is underlined in SEQ ID NO: 35below. The conserved glutamic acid (E), at amino acid residue 423, ismarked in bold. The E box motif, in particular the conserved glutamicacid residue, is thought to be important for the formation of oligomericpilus-like structures of 01523. Preferred fragments of 01523 include theconserved glutamic acid residue. Preferably, fragments include the E boxmotif. SEQ ID NO: 35 MKKKMIQSLLVASLAFGMAVSPVTPIAFAAETGTITVQDTQKGATYKAYKVFDAEIDNANVSDSNKDGASYLIPQGKEAEYKASTDFNSLFTTTTNGGRTYVTKKDTASANEIATWAKSISANTTPVSTVTESNNDGTEVINVSQYGYYYVSSTVNNGAVIMVTSVTPNATIHEKNTDATWGDGGGKTVDQKTYSVGDTVKYTITYKNAVNYHGTEKVYQYVIKDTMPSASVVDLNEGSYEVTITDGSGNITTLTQGSEKATGKYNLLEENNNFTITIPWAATNTPTGNTQNGANDDFFYKGINTITVTYTGVLKSGAKPGSADLPENTNIATINPNTSNDDPGQKVTVRDGQITIKKIDGSTKASLQGAIFVLKNATGQFLNFNDTNNVEWGTEANATEYTTGADGIITITGLKEGTYYLVEKKAPLGYNLLDNSQKVILGDGATDTTNSDNLLVNPTVENNKGTELPSTGGIGTTIEYIIGAILVIGAGIVLVARRRL RS01524

An example of an amino acid sequence for 01524 is set forth below. SEQ ID NO: 36 represents a 01524 sequence from GBS serotype III, strain isolate COH1.

SEQ ID NO: 36
MLKKCQTFIIESLKKKKHPKEWKIIMWSLMILTTFLTTYFLILPAITVEETKTDDVGITLENKNSSQVTSSTSSSQSSVEQSKPQTPASSVTETSSSEEAAYREEPLMFRGADYTVTVTLTKEAKIPKNADLKVTELKDNSATFKDYKKKALTEVAKQDSEIKNFKLYDITIESNGKEAEPQAPVKVEVNYDKPLEASDENLKVVHFKDDGQTEVLKSKDTAETKNTSSDVAFKTDSFSTYAIVQEDNTEVPRLTYHFQNNDGTDYDPLTASGMQVHHQIIKDGESLGEVGIPTIKAGEHFNGWYTYDPTTGKYGDPVKEGEPITVTETKEICVRPFMSKVATVTLYDDSAGKSILERYQVPLDSSGNGTADLSSFKVSPPTSTLLFVGWSKTQNGAPLSESEIQALPVSSDISLYPVFKESYGVEFNTGDLSTGVTYIAPRRVLTGQPASTIKPNDPTRPGYTFAGWYTAASGGAAFDFNQVLTKDTTLYAHWSPAQTTYTINYWQQSATDNKNATDAQKTYEYAGQVTRSGLSLSNQTLTQQDINDKLPTGFKVNNTRTETSVMIKDDGSSVVNVYYDRKLITIKFAKYGGYSLPEYYYSYNWSSDADTYTGLYGTTLAANGYQWKTGAWGYLANVGNNQVGTYGMSYLGEFILPNDTVDSDVIKLFPKGNIVQTYRFTKQGLDGTYSLADTGGGAGADEFTFTEKYLGFNVKYYQRLYPDNYLFDQYASQTSAGVKVPISDEYYDRYGAYHKDYLNLVVWYERNSYKIKYLDPLDNTELPNFPVKDVLYEQNLSSYAPDTTTVQPKPSRPGYVWDGKWYKDQAQTQVFDFNTTMPPHDVKVYAGWQKVTYRVNIDPNGGRLSKTDDTYLDLHYGDRIPDYTDITRDYIQDPSGTYYYKYDSRDKDPDSTKDAYYTTDTSLSNVDTTTKYKYVKDAYKLVGWYYVNPDGSIRPYNFSGAVTQDINLRAIWRKAGDYHIIYSNDAVGTDGKPALDASGQQLQTSNEPTDPDSYDDGSHSALLRRPTMPDGYRFRGWWYNGKIYNPYDSIDIDAHLADANKNITIKPVIIPVGDIKLEDTSIKYNGNGGTRVENGNVVTQVETPRMELNSTTTIPENQYFTRTGYNLIGWHHDKDLADTGRVEFTAGQSIGIDNNPDATNTLYAVWQPKEYTVRVSKTVVGLDEDKTKDFLFNPSETLQQENFPLRDGQTKEEKVPYGTSISIDEQAYDEFKVSESITEKNLATGEADKTYDATGLQSLTVSGDVDISFTNTRIKQKVRLQKVNVENDNNFLAGAVFDIYESDANGNKASHPMYSGLVTNDKGLLLVDANNYLSLPVGKYYLTETKAPPGYLLPKNDISVLVISTGVTFEQNGNNATPIKENLVDGSTVYTFKITNSKGTELPSTGGIGTHIYILVGLALALPSGLILYYRKKI

01524 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 131 LPSTG (shown in italics in SEQ ID NO: 36 above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 01524 protein from the host cell. Alternatively, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Three pilin motifs, containing conserved lysine (K) residues have beenidentified in 01524. The pilin motif sequences are underlined in SEQ IDNO: 36, below. Conserved lysine (K) residues are marked in bold, atamino acid residues 128 and 138, amino acid residues 671 and 682, andamino acid residues 809 and 820. The pilin sequences, in particular theconserved lysine residues, are thought to be important for the formationof oligomeric, pilus-like structures of 01524. Preferred fragments of01524 include at least one conserved lysine residue. Preferably,fragments include at least one pilin sequence. SEQ ID NO: 36MLKKCQTFIIESLKKKKHPKEWKIIMWSLMILTTFLTTYFLILPAITVEETKTDDVGITLENKNSSQVTSSTSSSQSSVEQSKPQTPASSVTETSSSEEAAYREEPLMFRGADYTVTVTLTKEAKIPKNADLKVTELK DNSATFKDYKKKALTEVAKQDSEIKNFKLYDITIESNGKEAEPQAPVKVEVNYDKPLEASDENLKVVHFKDDGQTEVLKSKDTAETKNTSSDVAFKTDSFSIYAIVQEDNTEVPRLTYHFQNNDGTDYDPLTASGMQVHHQIIKDGESLGEVGIPTIKAGEHFNGWYTYDPTTGKYGDPVKFGEPITVTETKEICVRPFMSKVATVTLYDDSAGKSILERYQVPLDSSGNGTADLSSFKVSPPTSTLLFVGWSKTQNGAPLSESEIQALPVSSDISLYPVFKESYGVEFNTGDLSTGVTYIAPRRVLTGQPASTIKPNDPTRPGYTFAGWYTAASGGAAFDFNQVLTKDTTLYAHWSPAQTTYTINYWQQSATDNKNATDAQKTYEYAGQVTRSGLSLSNQTLTQQDINDKLPTGFKVNNTRTETSVMIKDDGSSVVNVYYDRKLITIKFAKYGGYSLPEYYYSYNWSSDADTYTGLYGTTLAANGYQWKTGAWGYLANVGNNQVGTYGMSYLGEFILPNDTVDSDVIKLFPKGNIVQTYRFFK QGLDGTYSLADTGGGAGADEFTFTEKYLGFNVKYYQRLYPDNYLFDQYASQTSAGVKVPISDEYYDRYGAYHKDYLNLVVWYERNSYKIKYLDPLDNTELPNFPVKDVLYEQNLSSYA 
PDTTTVQPKPSRPGYVWDGKWYKDQAQTQVFDFNTTMPPHDVKVYAGWQKVTYRVNIDPNGGRLSKTDDTYLDLHYGDRIPDYTDITRDYIQDPSGTYYYKYDSRDKDPDSTKDAYYTTDTSLSNVDTTTKYKYVKDAYKLVGWYYVNPDGSIRPYNFSGAVTQDINLRAIWRKAGDYHIIYSNDAVGTDGKPALDASGQQLQTSNEPTDPDSYDDGSHSALLRRPTMPDGYRFRGWWYNGKIYNPYDSIDIDAHLADANKNITIKPVIIPVGDIKLEDTSIKYNGNGGTRVENGNVVTQVETPRMELNSTTTIPENQYFTRTGYNLIGWHHDKDLADTGRVEFTAGQSIGIDNNPDATNTLYAVWQPKEYTVRVSKTVVGLDEDKTKDFLFNPSETLQQENFPLRDGQTKEEKVPYGTSISIDEQAYDEFKVSESITEKNLATGEADKTYDATGLQSLTVSGDVDISFTNTRIKQKVRLQKVNVENDNNFLAGAVFDIYESDANGNKASHPMYSGLVTNDKGLLLVDANNYLSLPVGKYYLTETKAPPGYLLPKNDISVLVISTGVTFEQNGNNATPIKENLVDGSTVYTFKITNSKGTELPSTGGIGTHIYILVGLALALPSGLILYYRKKI

An E box containing a conserved glutamic residue has also beenidentified in 01524. The E box motif is underlined in SEQ ID NO: 36below. The conserved glutamic acid (E), at amino acid residue 1344, ismarked in bold. The E box motif, in particular the conserved glutamicacid residue, is thought to be important for the formation of oligomericpilus-like structures of 01524. Preferred fragments of 01524 include theconserved glutamic acid residue. Preferably, fragments include the E boxmotif. SEQ ID NO: 36 MLKKCQTFIIESLKKKKHPKEWKIIMWSLMILTTFLTTYFLILPAITVEETKTDDVGITLENKNSSQVTSSTSSSQSSVEQSKPQTPASSVTETSSSEEAAYREEPLMFRGADYTVTVTLTKEAKIPKNADLKVTELKDNSATFKDYKKKALTEVAKQDSEIKNFKLYDITIESNGKEAEPQAPVKVEVNYDKPLEASDENLKVVHFKDDGQTEVLKSKDTAETKNTSSDVAFKTDSFSTYAIVQEDNTEVPRLTYHFQNNDGTDYDPLTASGMQVHHQIIKDGESLGEVGIPTIKAGEHFNGWYTYDPTTGKYGDPVKEGEPITVTETKEICVRPFMSKVATVTLYDDSAGKSILERYQVPLDSSGNGTADLSSFKVSPPTSTLLFVGWSKTQNGAPLSESEIQALPVSSDISLYPVFKESYGVEFNTGDLSTGVTYIAPRRVLTGQPASTIKPNDPTRPGYTFAGWYTAASGGAAFDFNQVLTKDTTLYAHWSPAQTTYTINYWQQSATDNKNATDAQKTYEYAGQVTRSGLSLSNQTLTQQDINDKLPTGFKVNNTRTETSVMIKDDGSSVVNVYYDRKLITIKFAKYGGYSLPEYYYSYNWSSDADTYTGLYGTTLAANGYQWKTGAWGYLANVGNNQVGTYGMSYLGEFILPNDTVDSDVIKLFPKGNIVQTYRFTKQGLDGTYSLADTGGGAGADEFTFTEKYLGFNVKYYQRLYPDNYLFDQYASQTSAGVKVPISDEYYDRYGAYHKDYLNLVVWYERNSYKIKYLDPLDNTELPNFPVKDVLYEQNLSSYAPDTTTVQPKPSRPGYVWDGKWYKDQAQTQVFDFNTTMPPHDVKVYAGWQKVTYRVNIDPNGGRLSKTDDTYLDLHYGDRIPDYTDITRDYIQDPSGTYYYKYDSRDKDPDSTKDAYYTTDTSLSNVDTTTKYKYVKDAYKLVGWYYVNPDGSIRPYNFSGAVTQDINLRAIWRKAGDYHIIYSNDAVGTDGKPALDASGQQLQTSNEPTDPDSYDDGSHSALLRRPTMPDGYRFRGWWYNGKIYNPYDSIDIDAHLADANKNITIKPVIIPVGDIKLEDTSIKYNGNGGTRVENGNVVTQVETPRMELNSTTTIPENQYFTRTGYNLIGWHHDKDLADTGRVEFTAGQSIGIDNNPDATNTLYAVWQPKEYTVRVSKTVVGLDEDKTKDFLFNPSETLQQENFPLRDGQTKEEKVPYGTSISIDEQAYDEFKVSESITEKNLATGEADKTYDATGLQSLTVSGDVDISFTNTRIKQKVRLQKVNVENDNNFLAGAVFDIYESDANGNKASHPMYSGLVTNDKGLLLVDANNYLSLPVGKYYLTETKAPPGYLLPKNDISVLVISTGVTFEQNGNNATPIKENLVDGSTVYTFKITNSKGTELPSTGGIGTHIYILVGLALALPSGLILYYRKKI01525

An example of an amino acid sequence for 01525 is set forth below. SEQ ID NO: 37 represents a 01525 sequence from GBS serotype III, strain isolate COH1. SEQ ID NO: 37 MKRQISSDKLSQELDRVTYQKRFWSVIKNTIYILMAVASIAILIAVLWLPVLRIYGHSMNKTLSAGDVVFTVKGSNFKTGDVVAFYYNNKVLVKRVTAESGDWVNIDSQGDVYVNQHKLKEPYVIHKALGNSNIKYPYQVPDKKIFVLGDNRKTSIDSRSTSVGDVSEEQIVGKTSFRIWPLGKISSIN

GBS 322

GBS 322 refers to a surface immunogenic protein, also referred to as“sip”. Nucleotide and amino acid sequences of GBS 322 sequenced fromserotype V isolated strain 2603 V/R are set forth in Ref. 3 as SEQ ID8539 and SEQ OD 8540. These sequences are set forth below as SEQ ID NOS38 and 39: SEQ ID NO.38ATGAATAAAAAGGTACTATTGACATCGACAATGGCAGCTTCGCTATTATCAGTCGCAAGTGTTCAAGCACAAGAAACAGATACGACGTGGACAGCACGTACTGTTTCAGAGGTAAAGGCTGATTTGGTAAAGCAAGACAATAAATCATCATATACTGTGAAATATGGTGATACACTAAGCGTTATTTCAGAAGCAATGTCAATTGATATGAATGTCTTAGCAAAAATAAATAACATTGCAGATATCAATCTTATTTATCCTGAGACAACACTGACAGTAACTTACGATCAGAAGAGTCATACTGCCACTTCAATGAAAATAGAAACACCAGCAACAAATGCTGCTGGTCAAACAACAGCTACTGTGGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTCAATACAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTGTTTGGCCAATGAAGACATATTCTTCTGCGCCAGCTTTGAAATCAAAAGAAGTATTAGCACAAGAGCAAGCTGTTAGTCAAGCAGCAGCTAATGAACAGGTATCACGAGCTCCTGTGAAGTCGATTACTTCAGAAGTTCCAGCAGCTAAAGAGGAAGTTAAACCAACTCAGACGTCAGTCAGTCAGTCAACAACAGTATCACCAGCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGCAGCCCCTAGAGTGGCAAGTGTTAAAGTAGTCACTCCTAAAGTAGAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGACTACGACTTCACCAGCTACAGACAGTAAGTTACAAGCGACTGAAGTTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAGCACAACCAGCTTCAACAACAAATGCAGTAGCTGCACATCCTGAAAATGCAGGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTATGGAGTTAATGAATTCAGTACATACCGTGCGGGAGATCCAGGTGATCATGGTAAAGGTTTAGCAGTTGACTTTATTGTAGGTACTAATCAAGCACTTGGTAATAAAGTTGCACAGTACTCTACACAAAATATGGCAGCAAATAACATTTCATATGTTATCTGGCAACAAAAGTTTTACTCAAATACAAACAGTATTTATGGACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAACCACTATGACCACGTTCACGTATCATTTAACAAATAATATAAAAAAGGAAGCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAATTTTTATTA SEQ ID 
NO.39MNKKVLLTSTMAASLLSVASVQAQETDTTWTARTVSEVKADLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSHTATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTYSSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVSPASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATDSKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK
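
The relationship between the nucleotide sequence (SEQ ID NO: 38) and the encoded amino acid sequence (SEQ ID NO: 39) can be checked with a short translation routine. The following is a minimal sketch only. It assumes the standard codon assignments and reading frame 1, and the function name `translate` is illustrative and not part of the original disclosure.

```python
# Standard codon assignments, built from the usual T, C, A, G codon ordering.
# Assumption: the coding sequence is read in frame 1 from its first base.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        # Codons with ambiguous bases are reported as "X".
        aa = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 21 nucleotides of SEQ ID NO: 38 as printed above.
prefix_38 = "ATGAATAAAAAGGTACTATTG"
print(translate(prefix_38))
```

For the 21-nucleotide prefix shown, the routine returns MNKKVLL, which agrees with the first seven residues of SEQ ID NO: 39.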

GBS 322 contains an N-terminal leader or signal sequence region which is indicated by the underlined sequence near the beginning of SEQ ID NO: 39. In one embodiment, one or more amino acids from the leader or signal sequence region of GBS 322 are removed. An example of such a GBS 322 fragment is set forth below as SEQ ID NO: 40. SEQ ID NO: 40 DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSHTATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTYSSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSENPAAKEEVKPTQTSVSQSTTVSPASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATDSKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK

Additional preferred fragments of GBS 322 comprise the immunogenic epitopes identified in WO 03/068813, each of which is specifically incorporated by reference herein.

There may be an upper limit to the number of GBS proteins which will be in the compositions of the invention. Preferably, the number of GBS proteins in a composition of the invention is less than 20, less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than 12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, or less than 3. Still more preferably, the number of GBS proteins in a composition of the invention is less than 6, less than 5, or less than 4. Still more preferably, the number of GBS proteins in a composition of the invention is 3.

The GBS proteins and polynucleotides used in the invention are preferably isolated, i.e., separate and discrete, from the whole organism with which the molecule is found in nature or, when the polynucleotide or polypeptide is not found in nature, sufficiently free of other biological macromolecules so that the polynucleotide or polypeptide can be used for its intended purpose.

Group A Streptococcus Adhesin Island Sequences

The GAS AI polypeptides of the invention can, of course, be prepared by various means (e.g. recombinant expression, purification from GAS, chemical synthesis etc.) and in various forms (e.g. native, fusions, glycosylated, non-glycosylated etc.). They are preferably prepared in substantially pure form (i.e. substantially free from other streptococcal or host cell proteins) or substantially isolated form.

The GAS AI proteins of the invention may include polypeptide sequences having sequence identity to the identified GAS proteins. The degree of sequence identity may vary depending on the amino acid sequence (a) in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more). Polypeptides having sequence identity include homologs, orthologs, allelic variants and functional mutants of the identified GAS proteins. Typically, 50% identity or more between two proteins is considered to be an indication of functional equivalence. Identity between proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension penalty=1.
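
The identity determination described above can be illustrated with a small Smith-Waterman local alignment using affine gap penalties. This is a sketch under stated assumptions and not the MPSRCH implementation: a simple match/mismatch score (+5/-4) stands in for a substitution matrix, a gap of length k is assumed to cost gap_open + (k - 1) * gap_extend, and percent identity is assumed to be the number of identical pairs divided by the number of aligned columns. The function name `local_identity` is hypothetical.

```python
def local_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Affine-gap Smith-Waterman; returns (best score, percent identity)."""
    neg = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # M: last column pairs two residues.
    # X: last column places a residue of `a` opposite a gap.
    # Y: last column places a residue of `b` opposite a gap.
    M = [[neg] * cols for _ in range(rows)]
    X = [[neg] * cols for _ in range(rows)]
    Y = [[neg] * cols for _ in range(rows)]
    best, end = 0, None
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment start at this pair.
            M[i][j] = s + max(0, M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend)
            if M[i][j] > best:
                best, end = M[i][j], (i, j)
    if end is None:
        return 0, 0.0
    # Trace back from the best cell, counting columns and identical pairs.
    i, j = end
    state, columns, identical = "M", 0, 0
    while True:
        columns += 1
        if state == "M":
            identical += a[i - 1] == b[j - 1]
            prev = max((M[i - 1][j - 1], "M"),
                       (X[i - 1][j - 1], "X"),
                       (Y[i - 1][j - 1], "Y"))
            i, j = i - 1, j - 1
            if prev[0] <= 0:  # the alignment started at this pair
                break
            state = prev[1]
        elif state == "X":
            state = "M" if X[i][j] == M[i - 1][j] - gap_open else "X"
            i -= 1
        else:
            state = "M" if Y[i][j] == M[i][j - 1] - gap_open else "Y"
            j -= 1
    return best, 100.0 * identical / columns


# Toy peptides (hypothetical, not taken from the sequence listing).
score, identity = local_identity("MKRQISSDKL", "MKRQLSSDKL")
print(score, identity)
```

A pair of sequences would satisfy the "greater than 50%" preference above when the returned identity exceeds 50.0. Results will differ from a matrix-based search such as MPSRCH because of the simplified scoring.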

The GAS adhesin island polynucleotide sequences may include polynucleotide sequences having sequence identity to the identified GAS adhesin island polynucleotide sequences. The degree of sequence identity may vary depending on the polynucleotide sequence in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more).

The GAS adhesin island polynucleotide sequences of the invention may include polynucleotide fragments of the identified adhesin island sequences. The length of the fragment may vary depending on the polynucleotide sequence of the specific adhesin island sequence, but the fragment is preferably at least 10 consecutive polynucleotides, (e.g. at least 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more).

The GAS adhesin island amino acid sequences of the invention may include polypeptide fragments of the identified GAS proteins. The length of the fragment may vary depending on the amino acid sequence of the specific GAS antigen, but the fragment is preferably at least 7 consecutive amino acids, (e.g. 8, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). Preferably the fragment comprises one or more epitopes from the sequence. Other preferred fragments include (1) the N-terminal signal peptides of each identified GAS protein, (2) the identified GAS proteins without their N-terminal signal peptides, and (3) each identified GAS protein wherein up to 10 amino acid residues (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) are deleted from the N-terminus and/or the C-terminus, e.g. the N-terminal amino acid residue may be deleted. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
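
The fragment definitions above (contiguous fragments of at least 7 amino acids, and variants with residues deleted from either terminus) can be enumerated mechanically. The sketch below is illustrative only. The function names and the toy peptide are hypothetical, and signal peptide boundaries are not predicted here because the text does not give them.

```python
from typing import Iterator, List


def fragments(seq: str, min_len: int = 7) -> Iterator[str]:
    """Yield every contiguous fragment of seq with at least min_len residues."""
    for length in range(min_len, len(seq) + 1):
        for start in range(len(seq) - length + 1):
            yield seq[start:start + length]


def terminal_deletions(seq: str, max_deleted: int = 10) -> List[str]:
    """Return variants with 0..max_deleted residues removed from each terminus.

    The undeleted sequence and empty results are excluded.
    """
    variants = []
    for n_del in range(max_deleted + 1):
        for c_del in range(max_deleted + 1):
            if 0 < n_del + c_del < len(seq):
                variants.append(seq[n_del:len(seq) - c_del])
    return variants


# Toy 10-residue peptide (hypothetical, not from the sequence listing).
toy = "ACDEFGHIKL"
print(list(fragments(toy, min_len=7)))
print(terminal_deletions(toy, max_deleted=2))
```

For a full-length protein the number of fragments grows quadratically with sequence length, so in practice such an enumeration would usually be restricted to regions of interest, for example windows around predicted epitopes.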

GAS AI-1 Sequences

As discussed above, a GAS AI-1 sequence is present in an M6 strain isolate (MGAS10394). Examples of GAS AI-1 sequences from M6 strain isolate MGAS10394 are set forth below.

M6_Spy0156: Spy0156 is a rofA transcriptional regulator. An example of an amino acid sequence for M6_Spy0156 is set forth in SEQ ID NO: 41. SEQ ID NO: 41 MIEKYLESSIESKCQLVVLEFKTSYLPITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKRMISCQFTHPFKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDFARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEYRIRYLIALLYSKFGIKVYDLTQQDKNTIHSFLSHSSTHLKTSPWLSESFSFYDILLALSWKRHQFSVTIPQTRIFQQLKKLFIYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANNSFASLQWTPEHIRQCCQLFEENDTFRLLLKPIITLLPNLKEQKPSLVKALMFESKSFLENLQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWLAKLPGKRYLNHKHFHLFCHYVEQILRNIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYIAR

M6_Spy0157: M6_Spy0157 is a fibronectin binding protein. It contains asortase substrate motif LPXTG (SEQ ID NO: 122), shown in italics in theamino acid sequence SEQ ID NO: 42. SEQ ID NO: 42MVSSYMFVRGEKMNNKIFLNKEASFLAHTKRKRRFAVTLVGVFFMLLACAGAIGFGQVAYAADEKTVPSHSSPNPEFPWYGYDAYGKEYPGYNIWTRYHDLRVNLNGSRSYQVYCFNIQSNYPSQKNSEIKNWFKKIEGNGKSFVDYAHTTKLGKEELEQRLLSLLYNAYPNDANGYMKGLEHLNAITVTQYAVWHYSDNSQYQFETLWESEAKEGKISRSQVTLMREALKKLIDPNLEATAVNKIPSGYRLNIFESENEAYQNLLSAEYVPDDPPKPGETSEHNPKTPELDGTPIPEDPKHPDDNLEPTLPPVMLDGEEVPEVPSESLEPALPPLMPELDGQEVPEKPSIDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGFSETATVVEDTRPKLVFHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIFSLLKNKQSNKKV

M6_Spy0157 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 180 LPATG (shown in italics in SEQ ID NO: 42, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant M6_Spy0157 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
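
Locating a cell wall anchor motif, and producing a construct that lacks it, can be sketched with a regular expression search. The example below is a minimal illustration. The helper names are hypothetical, the motif list is taken from the sortase substrate motifs named in this section, and removal is assumed to mean truncating the motif together with everything C-terminal to it, which is only one possible design.

```python
import re

# Sortase substrate motifs named in the text; X stands for any residue.
MOTIFS = ["LPXTG", "LPXSG", "VVXTG", "EVXTG"]


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Turn a motif such as LPXTG into a regular expression (X -> any residue)."""
    return re.compile(motif.replace("X", "."))


def find_motifs(seq: str, motifs=MOTIFS):
    """Return sorted (0-based start, motif pattern, matched text) tuples."""
    hits = []
    for motif in motifs:
        for m in motif_regex(motif).finditer(seq):
            hits.append((m.start(), motif, m.group()))
    return sorted(hits)


def remove_anchor(seq: str, motifs=MOTIFS) -> str:
    """Truncate seq just before its last anchor motif; unchanged if none found."""
    hits = find_motifs(seq, motifs)
    return seq[:hits[-1][0]] if hits else seq


# C-terminal excerpt of SEQ ID NO: 42 as printed above.
tail_42 = "KNITPILPATGDIENVLAFLGILILSVLSIFSLLKNKQSNKKV"
print(find_motifs(tail_42))
print(remove_anchor(tail_42))
```

On this excerpt the search reports a single LPATG match, and the truncated construct ends immediately before it. Positions are 0-based within the excerpt and are not residue numbers of the full-length protein.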

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in M6_Spy0157. The pilin motif sequenceis underlined in SEQ ID NO: 42, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residues 277, 287, and 301. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of M6_Spy0157 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 42MVSSYMFVRGEKMNNKTFLNKEASFLAHTKRKRRFAVTLVGVFFMLLACAGAIGFGQVAYAADEKTVPSHSSPNPEFPWYGYDAYGKEYPGYNIWTRYHDLRVNLNGSRSYQVYCFNIQSNYPSQKNSFIKNWFKKIEGNGKSFVDYAHTTKLGKEELEQRLLSLLYNAYPNDANGYMKGLEHLNAITVTQYAVWHYSDNSQYQFETLWESEAKEGKISRSQVTLMREALKKLIDPNLEATAVNKIPSGYRLNIFESENEAYQNLLSAEYVPDDPPKPGETSEHNPKTPELDGTPTPEDP KHPDDNLEPTLPPVMLDGEEVPEVPSESLEPALPPLMPELDGQEVPEKPSTDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGESETATVVEDTRPKLVFHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIFSLLKNKQSNKKV

A repeated series of four E boxes containing a conserved glutamicresidue have been identified in M6_Spy0157. The E-box motifs areunderlined in SEQ ID NO: 42, below. The conserved glutamic acid (E)residues, at amino acid residues 415, 452, 489, and 526 are marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of M6_Spy0157. Preferred fragments of M6_Spy0157include at least one conserved glutamic acid residue. Preferably,fragments include at least one E box motif. SEQ ID NO: 42MVSSYMFVRGEKMNNKIFLNKEASFLAHTKRKRRFAVTLVGVFFMLLACAGAIGFGQVAYAADEKTVPSHSSPNPEFPWYGYDAYGKEYPGYNIWTRYHDLRVNLNGSRSYQVYCFNIQSNYPSQKNSFIKNWFKKIEGNGKSFVDYAHTTKLGKEELEQRLLSLLYNAYPNDANGYMKGLEHLNAITVTQYAVWHYSDNSQYQFETLWESEAKEGKISRSQVTLMREALKKLIDPNLEATAVNKIPSGYRLNIFESENEAYQNLLSAEYVPDDPPKPGETSEHNPKTPELDCTPIPEDPKHPDDNLEPTLPPVMLDGEEVPEVPSESLEPALPPLMPELDGQEVPEKPSIDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQIETEDTKEPEVLMGGQSESVEFTKDTQTGMSGFSETATVVEDTRPKLVFHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIFSLLKNKQSNKKV

M6_Spy0158: M6_Spy0158 is a reverse transcriptase. An example of Spy0158 is shown in the amino acid sequence SEQ ID NO: 43.

SEQ ID NO: 43

MSLRHQNKKGIRKEGWKSRPQSRWSDHCQLVAQKSVLKQAISKTVLAERGLFSCLDDYLERHALKVN

M6_Spy0159: M6_Spy0159 is a collagen adhesion protein. It contains asortase substrate motif LPXSG, shown in italics in the amino acidsequence SEQ ID NO: 44. SEQ ID NO: 44MYSRLKRELVIVINRKKKYKLIRLMVTVGLIFSQLVLPIRRLGLQMISTQTKVIPQEIVTQTETQGTQVVATKQKLESENSSLKVALKRESGFEHNATIDASLDTESQGDNSQRSVTQAIVTMALELRKQGLSIVDTKIVRIQSSTNQRNDITTTLTFKNGLSLEGASTEANDPNVRVGIVNPNDTVQTITPTIKQDADGKVKNLVFTGRLGKQVIIVSTTRLKEEQTISLDSYGELVIDGAVGLSQKDRPPYSKPITVNILKPKLSSIESSLDSKDFETVKTIDNLYTWDDQFYLLDFISKQYEVLKTDYQSAKDSTPQTRKILFGEYTVEPLVMNKGHNNTINIYIRSTRPLGLKPIGAAPALIQPRSFRSLTPRSTRMKRSAPVEKFEGELEHHKRIDYLGDNQNNPDTTIDDKEDEHDTSDLYRLYLDMTGKKNPLDILVVVDKSGSMQEGIGSVQRYRYYAQRWDDYYSQWVYHGTFDYSSYQGESFNRGQIHYRYRGIVSVSDGIRRDDAVKNSLLGVNGLLQRFVNINPENKLSVIGFQGSADYHAGKWYPDQSPRGGFYQPNLNNSRDAELLKGWSTNSLLDPNTLTALHNNGTNYHAALLKAKEILNEVKDDGRRKIMIFISDGVPTEYFGEDGYRSGNGSSNDRNNVTRSQEGSKLAIDEFKARYPNLSIYSLGVSKDINSDTASSSPVVLKYLSGEEHYYGITDTAELEKTLNKIVEDSKLSQLGISDSLSQYVDYYDKQPDVLVTRKSKVNDETEILYQKDQVQEAGKDIIDKVVFTPKTTSQPKGKVTLTFKSDYKVDDEYTYTLSFNVKASDEAYEKYKDNEGRYSEMGDSDTDYGTNQTSSGKGGLPSNSDASVNYMADGREQKLPYKHPVIQVKTVPITFTKVDADNNQKKLAGVEFELRKEDKKIVWEKGTTGSNGQLNFKYLQKGKTYYLYETKAKLGYTLPENPWEVAVANNGDIKVKHPIEGELKSKDGSYMIKNYKIYQLPSSGGRGSQIFIIVGSMTATVALLFYRRQHRKKQY

M6_Spy0159 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 181 LPSSG (shown in italics in SEQ ID NO: 44, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant M6_Spy0159 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in M6_Spy0159. The pilin motif sequenceis underlined in SEQ ID NO: 44, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residues 265 and 276. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of M6_Spy0159 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 44MYSRLKRELVIVINRKKKYKLIRLMVTVGLIFSQLVLPIRRLGLQMISTQTKVIPQEIVTQTETQGTQVVATKQKLESENSSLKVALKRESGFEHNATIDASLDTESQGDNSQRSVTQAIVTMALELRKQGLSIVDTKIVRIQSSTNQRNDITTTLTFKNGLSLEGASTEANDPNVRVGIVNPNDTVQTITPTIKQDADGKVKNLVFTGRLGKQVIIVSTTRLKEEQTISLDSYGELVIDGAVGLSQKDRPPYSKPITVNILKPKLSSIESSLDSK DFETVKTIDNLYTWDDQFYLLDFISKQYEVLKTDYQSAKDSTPQTRKILFGEYTVEPLVMNKGHNNTINIYIRSTRPLGLKPIGAAPALIQPRSFRSLTPRSTRMKRSAPVEKFEGELEHHKRIDYLGDNQNNPDTTIDDKEDEHDTSDLYRLYLDMTGKKNPLDILVVVDKSGSMQEGIGSVQRYRYYAQRWDDYYSQWVYHGTFDYSSYQGESFNRGQIHYRYRGIVSVSDGIRRDDAVKNSLLGVNGLLQRFVNINPENKLSVIGFQGSADYHAGKWYPDQSPRGGFYQPNLNNSRDAELLKGWSTNSLLDPNTLTALHNNGTNYHAALLKAKEILNEVKDDGRRKIMIFISDGVPTEYFGEDGYRSGNGSSNDRNNVTRSQEGSKLAIDEFKARYPNLSIYSLGVSKDINSDTASSSPVVLKYLSGEEHYYGITDTAELEKTLNKIVEDSKLSQLGISDSLSQYVDYYDKQPDVLVTRKSKVNDETEILYQKDQVQEAGKDIIDKVVFTPKTTSQPKGKVTLTFKSDYKVDDEYTYTLSFNVKASDEAYEKYKDNEGRYSEMGDSDTDYGTNQTSSGKGGLPSNSDASVNYMADGREQKLPYKHPVIQVKTVPITFTKVDADNNQKKLAGVEFELRKEDKKIVWEKGTTGSNGQLNFKYLQKGKTYYLYETKAKLGYTLPENPWEVAVANNGDIKVKHPIEGELKSKDGSYMIKNYKIYQLPSSGGRGSQIFIIVGSMTATVALLFYRRQHRKKQY

An E box containing a conserved glutamic residue has been identified inM6_Spy0159. The E-box motif is underlined in SEQ ID NO: 44, below. Theconserved glutamic acid (E), at amino acid residue 950, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of M6_Spy0159. Preferred fragments of M6_Spy0159include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 44MYSRLKRELVIVINRKKKYKLIRLMVTVGLIFSQLVLPIRRLGLQMISTQTKVIPQEIVTQTETQGTQVVATKQKLESENSSLKVALKRESGFEHNATIDASLDTESQGDNSQRSVTQAIVTMALELRKQGLSIVDTKIVRIQSSTNQRNDITTTLTFKNGLSLEGASTEANDPNVRVGIVNPNDTVQTITPTIKQDADGKVKNLVFTGRLGKQVIIVSTTRLKEEQTISLDSYGELVIDGAVGLSQKDRPPYSKPITVNILKPKLSSIESSLDSKDFETVKTIDNLYTWDDQFYLLDFISKQYEVLKTDYQSAKDSTPQTRKILFGEYTVEPLVMNKGHNNTINIYIRSTRPLGLKPIGAAPALIQPRSFRSLTPRSTRMKRSAPVEKFEGELEHHKRIDYLGDNQNNPDTTIDDKEDEHDTSDLYRLYLDMTGKKNPLDILVVVDKSGSMQEGIGSVQRYRYYAQRWDDYYSQWVYHGTFDYSSYQGESFNRGQIHYRYRGIVSVSDGIRRDDAVKNSLLGVNGLLQRFVNINPENKLSVIGFQGSADYHAGKWYPDQSPRGGFYQPNLNNSRDAELLKGWSTNSLLDPNTLTALHNNGTNYHAALLKAKEILNEVKDDGRRKIMIFISDGVPTEYFGEDGYRSGNGSSNDRNNVTRSQEGSKLAIDEFKARYPNLSIYSLGVSKDINSDTASSSPVVLKYLSGEEHYYGITDTAELEKTLNKIVEDSKLSQLGISDSLSQYVDYYDKQPDVLVTRKSKVNDETEILYQKDQVQEAGKDIIDKVVFTPKTTSQPKGKVTLTFKSDYKVDDEYTYTLSFNVKASDEAYEKYKDNEGRYSEMGDSDTDYGTNQTSSGKGGLPSNSDASVNYMADGREQKLPYKHPVIQVKTVPITFTKVDADNNQKKLAGVEFELRKEDKKIVWEKGTTGSNGQLNFKYLQKGKTYYLYETKAKLGYTLPENPWEVAVANNGDIKVKHPIEGELKSKDGSYMIKNYKIYQLPSSGGRGSQIFIIVGSMTATVALLFYRRQHRKKQY

M6_Spy0160: M6_Spy0160 is a fimbrial structural subunit. It contains asortase substrate motif LPXTG (SEQ ID NO: 122), shown in italics inamino acid sequence SEQ ID NO: 45. SEQ ID NO: 45MTNRRETVREKILITAKKLMLACLATLAVVGLGMTRVSALSKDDTAQLKITNIEGGPTVTLYKIGEGVYNTNGDSFINFKYAEGVSLTETGPTSQEITTIANGINTGKIKPFSTENVSISNGTATYNARGASVYIALLTGATDGRTYNPILLAASYNGEGNLVTKNIDSKSNYLYGQTSVAKSSLPSITKKVTGTIDDVNKKTTSLGSVLSYSLTFELPSYTKEAVNKTVYVSDNMSEGLTFNFNSLTVEWKGKMANITEDGSVMVENTKIGIAKEVNNGFNLSFIYDSLESISPNISYKAVVNNKAIVGEEGNPNKAEFFYSNNPTKGNTYDNLDKKPDKGNGITSKEDSKIVYTYQIAFRKVDSVSKTPLTGAIFGVYDTSNKLIDIVTTNKNGYAISTQVSSGKYKIKELKAPKGYSLNTETYEITANWVTATVKTSANSKSTTYTSDKNKATDNSEQVGWLKNGIFYSIDSRPTGNDVKEAYIESTKALTDGTTFSKSNEGSGTVLLETDIPNTKLGELPSTGSIGTYLEKAIGSAAMIGAIGIYI VKRRKA

M6_Spy0160 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 131 LPSTG (shown in italics in SEQ ID NO: 45, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant M6_Spy0160 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

An E box containing a conserved glutamic residue has been identified inM6_Spy0160. The E-box motif is underlined in SEQ ID NO: 45, below. Theconserved glutamic acid (E), at amino acid residue 412, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of M6_Spy0160. Preferred fragments of M6_Spy0160include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 45MTNRRETVREKILITAKKLMLACLATLAVVGLGMTRVSALSKDDTAQLKITNIEGGPTVTLYKIGEGVYNTNGDSFINFKYAEGVSLTETGPTSQEITTIANGINTGKIKPFSTENVSISNGTATYNARGASVYIALLTGATDGRTYNPILLAASYNGEGNLVTKNIDSKSNYLYGQTSVAKSSLPSITKKVTGTIDDVNKKTTSLGSVLSYSLTFELPSYTKEAVNKTVYVSDNMSEGLTFNFNSLTVEWKGKMANITEDGSVMVENTKIGIAKEVNNGFNLSFIYDSLESISPNISYKAVVNNKAIVGEEGNPNKAEFFYSNNPTKGNTYDNLDKKPDKGNGITSKEDSKIVYTYQIAFRKVDSVSKTPLTGAIFGVYDTSNKLIDIVTTNKNGYAISTQVSSGKYKIKELKAPKGYSLNTETYEITANWVTATVKTSANSKSTTYTSDKNKATDNSEQVGWLKNGIFYSIDSRPTGNDVKEAYIESTKALTDGTTFSKSNEGSGTVLLETDIPNTKLGELPSTGSIGTYLFKAIGSAAMIGAIGIYI VKRRKA

M6_Spy0161 is a srtB type sortase. An example of an amino acid sequence of M6_Spy0161 is shown in SEQ ID NO: 46. SEQ ID NO: 46 MTERLKNLGILLLFLLGTAIFLYPTLSSQWNAYRDRQLLSTYHKQVIQKKPSEMEEVWQKAKAYNARLGIQPVPDAFSFRDGIHDKNYESLLQIENNDIMGYVEVPSIKVTLPIYHYTTDEVLTKGAGHLFGSALPVGGDGTHTVISAHRGLPSAEMFTNLNLVKKGDTFYFRVLNKVLAYKVDQILIVEPDQATSLSGVMGKDYATLVTCTPYGVNTKRLLVRGHRIAYHYKKYQQAKKAMKLVDKSRMWAEVVCAAFGVVIAIILVFMYSRVSAKKSK

As discussed above, applicants have also determined the nucleotide and encoded amino acid sequence of fimbrial structural subunits in several other GAS AI-1 strains of bacteria. Examples of sequences of these fimbrial structural subunits are set forth below.

M6 strain isolate CDC SS 410 is a GAS AI-1 strain of bacteria. CDC SS410_fimbrial is thought to be a fimbrial structural subunit of M6 strainisolate CDC SS 410. An example of a nucleotide sequence encoding the CDCSS 410_fimbrial protein (SEQ ID NO: 267) and a CDC SS 410_fimbrialprotein amino acid sequence (SEQ ID NO: 268) are set forth below. SEQ IDNO: 267 aaagatgatactgcacaactaaagataacaaatattgaaggtgggccaacagtaacactttataaaataggagaaggtgtttacaacactaatggtgattcttttattaactttaaatatgctgagggggtttctttaactgaaacaggacctacatcacaagaaattactactattgcaaatggtattaatacgggtaaaataaagccttttagtactgaaaacgttagtatttctaatggaacagcaacttataatgcgagaggtgcatctgtttatattgcattattaacaggtgcgacagatggccgtacctacaatcctattttattagctgcatcttataatggtgagggaaatttagttactaaaaatattgattccaaatctaattatttatatggacaaacaagtgttgcaaaatcatcattaccatctattacaaagaaagtaaccgggacaatagatgacgtgaataaaaagactacctcgttaggaagtgtattgtcttattcgctgacatttgaattaccaagttataccaaagaagcagtcaataaaacagtatatgtttctgataatatgtcggaaggtcttacttttaactttaatagtcttacagtagaatggaaaggtaagatggctaatattactgaagatggttcagtaatggtagaaaatacaaaaatcggaatagctaaggaggttaataacggttttaatttaagttttatttatgatagtttagaatctatatcaccaaatataagttataaagctgttgtaaacaataaagctattgttggtgaagagggtaatcctaataaagctgaattcttctattcaaataatccaacaaaaggtaatacatacgataatttagataagaagcctgataaagggaatggtattacatccaaagaagattctaaaattgtttatacttatcaaatagcgtttagaaaagttgatagtgttagtaagaccccacttattggtgcaatttttggagtttatgatactagtaataaattaattgatattgttacaaccaataaaaatggatatgctatttcaacacaagtatcttcaggaaaatataaaattaaggaattaaaagctcctaaaggttattcattgaatacagaaacttatgaaattacggcaaattgggtaactgctacagtcaagacaagtgctaattcaaaaagtactacttatacatctgataaaaataaggcgacagataattcagagcaagtaggatggttaaaaaatggtatattctattctatagatagtagacctacaggaaatgatgttaaagaggcttatattgaatctactaaggctttaactgatggaacaactttctcaaaatcgaatgaaggttcaggtacagtattattagaaactgacatccctaacaccaagctaggtgaactc SEQ ID NO: 
268KDDTAQLKTTNIEGGPTVTLYKIGEGVYNTNGDSFINFKYAEGVSLTETGPTSQEITTIANGINTGKIKPFSTENVSISNGTATYNARGASVYIALLTGATDGRTYNPILLAASYNGEGNLVTKNIDSKSNYLYGQTSVAKSSLPSITKLVTGTIDDVNKKTTSLGSVLSYSLTFELPSYTKEAVNKTVYVSDNMSEGLTFNFNSLTVEWKKGMANITEDGSVMVENTKIGIAKEVNNGFNLSFIYDSLESISPNISYKAVVNNKAIVGEEGNPNKAEFFYSNNPTKGNTYDNLDKKPDKGNGITSKEDSKIVYTYQIAFRKVDSVSKTPLIGAIFGVYDTSNKLIDIVTTNKNGYAISTQVSSGKYKIKELKAPKGYSLNTETYEITANWVTATVKTSANSKSTTYTSDKNKATDNSEQVGWLKNGIFYSIDSRPTGNDVKEAYTESTKALTDGTTFSKSNEGSGTVLLETDIPNTKLGEL

M6 strain isolate ISS 3650 is a GAS AI-1 strain of bacteria.ISS3650_fimbrial is thought to be a fimbrial structural subunit of M6strain isolate ISS 3650. An example of a nucleotide sequence encodingthe ISS3650_fimbrial protein (SEQ ID NO: 269) and an ISS3650_fimbrialprotein amino acid sequence (SEQ ID NO: 270) are set forth below. SEQ IDNO: 269 gaatggaaaggtaagatggctaatattactgaagatggttcagtaatggtagaaaatacaaaaatcggaatagctaaggaggttaataacggttttaatttaagttttatttatgatagtttagaatctatatcaccaaatataagttataaagctgttgtaaacaataaagctattgttggtgaagagggtaatcctaataaagctgaattcttctattcaaataatccaacaaaaggtaatacatacgataatttagataagaagcctgataaagggaatggtattacatccaaagaagattctaaaattgtttatacttatcaaatagcgtttagaaaagttgatagtgttagtaagaccccacttattggtgcaatttttggagtttatgatactagtaataaattaattgatattgttacaaccaataaaaatggatatgctatttcaacacaagtatcttcaggaaaatataaaattaaggaattaaaagctcctaaaggttattcattgaatacagaaacttatgaaattacggcaaattgggtaactgctacagtcaagacaagtgctaattcaaaaagtactacttatacatctgataaaaataaggcgacagataattcagagcaagtaggatggttaaaaaatggtatattctattctatagatagtagacctacaggaaatgatgttaaagaggcttatattgaatctactaaggctttaactgatggaacaactttctcaaaatcgaatgaaggttcaggtacagtattattagaaactgacatcc SEQ ID NO: 270EWKGKMANITEDGSVMVENTKIGIAKEVNNGFNLSFIYDSLESISPNISYKAVVNNKAIVGEEGNPNKAEFFYSNNPTKGNTYDNLDKKPDKGNGITSKEDSDIVYTYQIAFRKVDSVSKTPLIGAIFGVYDTSNKLIDIVTTNKNGYAISTQVSSGKYKIKELKAPKGYSLNTETYEITANWVTATVKTSANSKSTTYTSDKNKATDNSEQVGWLKNGIFYSIDSRPTGNDVKEAYIESTKALTDGTTF SKSNEGSGTVLLETDI

M23 strain isolate DSM2071 is a GAS AI-1 strain of bacteria.DSM2071_fimbrial is thought to be a fimbrial structural subunit of M23strain DSM2071. An example of a nucleotide sequence encoding theDSM2071_fimbrial protein (SEQ ID NO: 251) and a DSM2071_fimbrial proteinamino acid sequence (SEQ ID NO: 252) are set forth below. SEQ ID NO: 251atgagagagaaaatattaatagcagcaaaaaaactaatgctagcttgtttagctatcttagctgtagtagggcttggaatgacaagagtatcagctttatcaaaagatgataaggcggagttgaagataacaaatatcgaaggtaaaccgaccgtgacactgtataaaattggtgatggaaaatacagtgagcgaggggattcttttattggatttgagttaaagcaaggtgtggagctaaataaggcaaaacctacatctcaagaaataaataaaatcgctaatggtattaataaaggtagtgttaaggctgaagtagttaatataaaagaacatgctagtacaacttatagttatacaacaactggtgcaggtatttacttggctatattgactggagctactgatggacgtgcctataatcctatcttactgacagcttcttacaatgaggaaaatccacttaagggagggcagattgacgcaactagtcattatctttttggagaagaagcagttgctaaatctagccaaccaacaattagcaagtcaattacaaaatccacaaaagatggtgataaagatacagcatctgtaggtgaaaaagttgattacaaattaactgttcagttaccaagttattcgaaagatgctatcaataaaacggtgtttatcactgacaaattgtctcagggacttactttccttccaaaaagtttaaagattatctggaatggtcaaacgttaacaaaggtgaatgaagaatttaaagctggagataaggtaattgctcaacttaaggttgaaaataatggatttaatctgaactttaattatgataaccttgataatcatgccccagaagttaactatagtgctctactaaatgaaaacgcagttgttggtaaaggtggtaatgacaataatgtagactattactattcaaataatccgaataaaggagagacccataaaacaactgagaagcctaaagagggtgaaggtactggtatcactaaaaagacggataaaaaaaccgtctacacctatcgtgtagcctttaagaaaacaggcaaagatcatgccccactagctggtgctgttttcggtatctattcagataaggaagcgaaacaattagtcgatattgttgtgacaaatgcacagggttatgcagcatcaagcgaagttgggaaagggacttattacattaaagaaattaaatcccctaagggttactctttaaatacaaatatttatgaagtggaaacttcatgggaaaaagctacaacgacttctacaactaatcgtttagagacaatttatacaacagatgataatcaaaagtctccaggaactaatacagttggttggttggaagatggtgtcttttacaaagaaaatccaggtggtgatgctaaacttgcctatatcaaacaatcaacagaggagacttctacaactatagaagtcaaagaaaatcaagctgaaggttcaggtacggtattattagaaactgaaattcctaacaccaaattaggtgaattaccttcgacaggtagcattggtacttacctctttaaagctattggttcggctgctatgatcggtgcaattggtatttatattgttaaacgtcgtaaagcttaa SEQ ID NO: 
252MREKILIAAKKLMLACLATLAVVGLGMTRVSALSKDDKAELKITNIEGKPTVTLYKIGDGKYSERGDSFIGFELKQGVELNKAKPTSQEINKIANGINKGSVKAEVVNIKEHASTTYSYTTTGAGIYLAILTGATDGRAYNPILLTASYNEENPLKGGQIDATSHYLFGEEAVAKSSQPTISKSITKSTKDGDKDTASVGEKVDYKLTVQLPSYSKDAINKTVFITDKLSQGLTFLPKSLKIIWNGQTLTKVNEEFKAGDKVIAQLKVENNGFNLNFNYDNLDNHAPEVNYSALLNENAVVGKGGNDNNVDYYYSNNPNKGETHKTTEKPKEGEGTGITKKTDKKTVYTYRVAFKKTGKDHAPLAGAVFGIYSDKEAKQLVDIVVTNAQGYAASSEVGKGTYYIKEIKSPKGYSLNTNIYEVETSWEKATTTSTTNRLETIYTTDDNQKSPGTNTVGWLEDGVFYKENPGGDAKLAYIKQSTEETSTTIEVKENQAEGSGTVLLETEIPNTKLGELPSTGSIGTYLFKAIGSAAMIGAIGIYIVKRRKA

GAS AI-2 Sequences

As discussed above, a GAS AI-2 sequence is present in an M1 strain isolate (SF370). Examples of GAS AI-2 sequences from M1 strain isolate SF370 are set forth below.

Spy0124 is a rofA transcriptional regulator. An example of an amino acid sequence for Spy0124 is set forth in SEQ ID NO: 47. SEQ ID NO: 47 MIEKYLESSIESKGQLIVLFFKTSYLPITEVAEKTGLTFLQLNHYCEELNAFFPGSLSMTIQKRMISCQFTHPFKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDPARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEYRIRYLIALLYSKFGIKVYDLTQQDKNTIHSFLSHSSTHLKTSPWLSESESFYDILLALSWKRHQFSVTIPQTRIFQQLKKLFVYDSLKKSSHDIIETYGQLNFSAGDLDYLYLIYITANNSFASLQWTPEHIRQYCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSKSFLFNLQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRDLNHKHFHLFCHYVEQSLRNIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITHSQLIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKEQADLTKQLT

GAS 015 is also referred to as Cpa. It contains a sortase substratemotif VVXTG (SEQ ID NO: 135), shown in italics in SEQ ID NO: 48. SEQ IDNO: 48 LRGEKMKKTRFPNKLNTLNTQRVLSKNSKRFTVTLVGVFLMIEALVTSMVGAKTVFGLVESSTPNAINPDSSSEYRWYGYESYVRGHPYYKQFRVAHDLRVNLEGSRSYQVYCFNLKKAFPLGSDSSVKKWYKKHDGISTKFEDYAMSPRITGDELNQKLRAVMYNGHPQNANGIMEGLEPLNAIRVTQEAVWYYSDNAPISNPDESFKRESESNLVSTSQLSLMRQALKQLIDPNLATKMPKQVPDDFQLSIFESEDKGDKYNKGYQNLLSGGLVPTKPPTPGDPPMPPNQPQTTSVLIRKYAIGDYSKLLEGATLQLTGDNVNSFQARVFSSNDIGERIELSDGTYTLTELNSPAGYSIAEPITFKVEAGKVYTIIDGKQIENPNKEIVEPYSVEAYNDFEEFSVLTTQNYAKFYYAKNKNGSSQVVYCFNADLKSPPDSEDGGKTMTPDFTTGEVKYTHIAGRDLFKYTVKPRDTDPDTFLKHIKKVIEKGYREKGQAIEYSGLTETQLRAATQLATYYFTDSAELDKDKLKDYHGFGDMNDSTLAVAKILVEYAQDSNPPQLTDLDFFIPNNNKYQSLIGTQWHPEDLVDTIRMEDKKEVIPVTHNLTLRKTVTGLAGDRTKDFHFEIELKNNKQELLSQTVKTDKTNLEFKDGKATINLKHGESLTLQGLPEGYSYLVKETDSEGYKVKVNSQEVANATVSKTGITSDETLAFENNKEPVVPTGVDQKINGYLALIVIAGISLGI WGIHTIRIRKHD

GAS 015 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 182 VVPTG (shown in italics in SEQ ID NO: 48, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant GAS 015 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in GAS 015. The pilin motif sequence isunderlined in SEQ ID NO: 48, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residue 243. The pilin sequence, inparticular the conserved lysine residues, are thought to be importantfor the formation of oligomeric, pilus-like structures. Preferredfragments of GAS 015 include the conserved lysine residue. Preferably,fragments include the pilin sequence. SEQ ID NO: 48LRGEKMKKTRFPNKLNTLNTQRVLSKNSKRFTVTLVGVFLMIFALVTSMVGAKTVFGLVESSTPNAINPDSSSEYRWYGYESYVRGHPYYKQFRVAHDLRVNLEGSRSYQVYCFNLKKAFPLGSDSSVKKWYKKHDGISTKFEDYANSPRITGDELNQKLRAVMYNGHPQNANGIMEGLEPLNAIRVTQEAVWYYSDNAPISNPDESFKRESESNLVSTSQLSLMRQALKQLIDPNLATKMPKQVPDDFQLSIFESEDKGDKYNKGYQNLLSGGLVPTKPPTPGDPPMPPNQPQTTSVLIRKYAIGDYSKLLEGATLQLTGDNVNSFQARVFSSNDIGERIELSDGTYTLTELNSPAGYSIAEPITFKVEAGKVYTIIDGKQIENPNKETVEPYSVEAYNDFEEFSVLTTQNYAKFYYAKNKNGSSQVVYCFNADLKSPPDSEDGGKTMTPDFTTGEVKYTHIAGRDLFKYTVKPRDTDPDTFLKHIKKVIEKGYREKGQAIEYSGLTETQLRAATQLAIYYFTDSAELDKDKLKDYHGFGDMNDSTLAVAKILVEYAQDSNPPQLTDLDFFIPNNNKYQSLIGTQWHPEDLVDTIRMEDKKEVIPVTHNLTLRKTVTGLAGDRTKDFHPEIELKNNKQELLSQTVKTDKTNLEFKDGKATINLKHGESLTLQGLPEGYSYLVKETDSEGYKVKVNSQEVANATVSKTGTTSDETLAFENNKEPVVPTGVDQKINGYLALIVIAGISLGI WGIHTIRIRKHD

An E box containing a conserved glutamic residue has been identified inGAS 015. The E-box motif is underlined in SEQ ID NO: 48, below. Theconserved glutamic acid (E), at amino acid residue 352, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of GAS 015. Preferred fragments of GAS 015 includethe conserved glutamic acid residue. Preferably, fragments include the Ebox motif. SEQ ID NO: 48LRGEKMKKTRFPNKLNTLNTQRVLSKNSKRFTVTLVGVFLMIFALVTSMVGAKTVFGLVESSTPNAINPDSSSEYRWYGYESYVRGHPYYDQFRVAHDLRVNLEGSRSYQVYCFNLKKAFPLGSDSSVKKWYKKHDGISTKFEDYAMSPRITGDELNQKLRAVMYNGHPQNANGIMEGLEPLNAIRVTQEAVWYYSDNAPISNPDESFKRESESNLVSTSQLSLMRQALKQLIDPNLATKMPKQVPDDFQLSIFESEDKGDKYNKGYQNLLSGGLVPTKPPTPGDPPMPPNQPQTTSVLIRKYAIGDYSKLLEGATLQLTGDNVNSFQARVFSSNDIGERIELSDGTYTLTELNSPAGYSIAEPITFKVEAGKVYTIIDGKQIENPNKEIVEPYSVEAYNDFEEFSVLTTQNYAKFYYAKNKNGSSQVVYCPNADLKSPPDSEDGGKTMTPDFTTGEVKYTHIAGRDLFKYTVKPRDTDPDTFLKHIKKVIEKGYREKGQAIEYSGLTETQLRAATQLAIYYFTDSAELDKDKLKDYHGFGDMNDSTLAVAKILVEYAQDSNPPQLTDLDFFIPNNNKYQSLIGTQWHPEDLVDIIRMEDKKEVIPVTHNLTLRKTVTGLAGDRTKDFHFEIELKNNKQELLSQTVKTDKTNLEFKDGKATINLKHGESLTLQGLPEGYSYLVKETDSEGYKVKVNSQEVANATVSKTGITSDETLAFENNKEPVVPTGVDQKINGYLALIVIAGISLGI WGIHTIRIRKHD

Spy0127 is a LepA putative signal peptidase. An example of an amino acid sequence for Spy0127 is set forth in SEQ ID NO: 49. SEQ ID NO: 49 MIIKRNDMAPSVKAGDAILFYRLSQTYKVEEAVVYEDSKTSITKVGRIIAQAGDEVDLTEQGELKINGHIQNEGLTFIKSREANYPYRIADNSYLTLNDYYSQESENYLQDAIAKDAIKGTINTLIRLRNH

Spy0128 is thought to be a fimbrial protein. It contains a sortase substrate motif EVXTG (SEQ ID NO: 136) shown in italics in SEQ ID NO: 50. SEQ ID NO: 50 MKLRHLLLTGAALTSFAATTVHGETVVNGAKLTVTKNLDLVNSNALIPNTDFTFKIEPDTTVNEDGNKFKGVALNTPMTKVTYTNSDKGGSNTKTAEFDFSEVTFEKPGVYYYKVTEEKIDKVPGVSYDTTSYTVQVHVLWNEEQQKPVATYIVGYKEGSKVPIQFKNSLDSTTLTVKKKVSGTGGDRSKDFNFGLTLKANQYYKASEKVMIEKTTKGGQAPVQTEASIDQLYHFTLKDGESIKVTNLPVGVDYVVTEDDYKSEKYTTNVEVSPQDGAVKNIAGNSTEQETSTDKDMTITFTNKKDFEVPTGVAMTVAPYIALGIVAVGGALYFVKKKNA

Spy0128 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 183 EVPTG (shown in italics in SEQ ID NO: 50, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Spy0128 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two E boxes containing a conserved glutamic residue have been identifiedin Spy0128. The E-box motifs are underlined in SEQ ID NO: 50, below. Theconserved glutamic acid (E) residues, at amino acid residues 271 and290, are marked in bold. The E box motifs, in particular the conservedglutamic acid residues, are thought to be important for the formation ofoligomeric pilus-like structures of Spy0128. Preferred fragments ofSpy0128 include at least one conserved glutamic acid residue.Preferably, fragments include at least one E box motif. SEQ ID NO: 50MKLRHLLLTGAALTSFAATTVHGETVVNGAKLTVTKNLDLVNSNALIPNTDFTFKIEPDTTVNEDGNKFKGVALNTPMTKVTYTNSDKGGSNTKTAEFDFSEVTPEKPGVYYYKVTEEKIDKVPGVSYDTTSYTVQVHVLWNEEQQKPVATYIVGYKEGSKVPIQFKNSLDSTTLTVKKKVSGTGGDRSKDFNFGLTLKANQYYKASEKVMIEKTTKGGQAPVQTEASTDQLYHFTLKDGESIKVTNLPVGVDYVVTEDDYKSEKYTTNVEVSPQDGAVKNIAGNSTEQETSTDKDMTITFTNKKDFEVPTGVAMTVAPYIALGIVAVGGALYFVKKKNA

Spy0129 is a SrtC1 type sortase. An example of an amino acid sequence for Spy0129 is set forth in SEQ ID NO: 51. SEQ ID NO: 51 MIVRLIKLLDKLINVIVLCFFFLCLLIAALGIYDALTVYQGANATNYQQYKKKGVQFDDLLAINSDVMAWLTVKGTHIDYPIVQGENNLEYINKSVEGEYSLSGSVFLDYRNKVTFEDKYSLIYAHHMAGNVMFGELPNFRKKSFFNKHKEESIETKTKQKLKINIFACIQTDAFDSLLFNPIDVDISSKNEFLNHIKQKSVQYREILTTNESRFVALSTCEDMTTDGRIIVIGQIE

Spy0130 is referred to as a hypothetical protein. It contains a sortasesubstrate motif LPXTG (SEQ ID NO: 122), shown in italics in SEQ ID NO:52. SEQ ID NO: 52 MKKSILRILAIGYLLMSFCLLDSVEAENLTASINIEVINQVDVATNKQSSDIDETFMPVIEALDKESPLPNSVTTSVKGNGKTSFEQLTPSEVGQYHYKIHQLLGKNSQYHYDETVYEVVIYVLYNEQSGALETNLVSNKLGETEKSELIFKQEYSEKTPEPHQPDTTEKEKPQKKRNGILPSTGEMVSYVSALGIVLVA TITLYSIYKKLKTSK

Spy0130 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 131 LPSTG (shown in italics in SEQ ID NO: 52, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Spy0130 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two E boxes containing conserved glutamic residues have been identifiedin Spy0130. The E-box motifs are underlined in SEQ ID NO: 52, below. Theconserved glutamic acid (E) residues, at amino acid residues 118 and148, are marked in bold. The E box motifs, in particular the conservedglutamic acid residues, are thought to be important for the formation ofoligomeric pilus-like structures of Spy0130. Preferred fragments ofSpy0130 include at least one conserved glutamic acid residue.Preferably, fragments include at least one E box motif. SEQ ID NO:52MKKSILRILAIGYLLMSFCLLDSVEAENLTASINIEVINQVDVATNKQSSDTDETFMFVIEALDKESPLPNSVTTSVKGNGKTSFEQLTFSEVGQYHYKIHQLLGKNSQYHYDETVYEVVIYVLYNEQSGALETNLVSNKLGETEKSELIFKQEYSEKTPEPHQPDTTEKEKPQKKRNGILPSTGEMVSYVSALGIVLVA TITLYSIYKKLKTSK
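As a minimal sketch, and assuming that residue numbers are 1-based and counted from the first residue of the sequence as printed, the positions stated for the conserved glutamic acid residues of Spy0130 (118 and 148) can be checked, and a candidate fragment can be tested for whether it retains such a residue. The helper names and the flank width are hypothetical and chosen only for this example. The sequence used is the first 160 residues of SEQ ID NO: 52 as first listed above, split into blocks of ten so that positions are easy to count.

```python
# First 160 residues of SEQ ID NO: 52, in blocks of ten for easy counting.
SPY0130_N160 = "".join([
    "MKKSILRILA", "IGYLLMSFCL", "LDSVEAENLT", "ASINIEVINQ",
    "VDVATNKQSS", "DIDETFMPVI", "EALDKESPLP", "NSVTTSVKGN",
    "GKTSFEQLTP", "SEVGQYHYKI", "HQLLGKNSQY", "HYDETVYEVV",
    "IYVLYNEQSG", "ALETNLVSNK", "LGETEKSELI", "FKQEYSEKTP",
])


def residue_at(sequence, position):
    """Return the residue at a 1-based position."""
    if not 1 <= position <= len(sequence):
        raise IndexError("position outside the sequence")
    return sequence[position - 1]


def fragment_around(sequence, position, flank):
    """Return the fragment spanning `flank` residues on each side of
    `position` (1-based), clipped to the ends of the sequence."""
    start = max(1, position - flank)
    end = min(len(sequence), position + flank)
    return sequence[start - 1:end]


def fragment_includes(start, end, position):
    """True when the 1-based inclusive range start..end covers `position`."""
    return start <= position <= end


for pos in (118, 148):
    print(pos, residue_at(SPY0130_N160, pos), fragment_around(SPY0130_N160, pos, 5))
```

A fragment defined by its start and end coordinates can then be screened with `fragment_includes` before any further analysis.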

Spy0131 is referred to as a conserved hypothetical protein. An exampleof an amino acid sequence of Spy0131 is set forth in SEQ ID NO: 53 SEQID NO: 53 MTRTNYQKKRMTCPVETEDITYRRKKIKGRRQAILAQFEPELVHHELIGDSCTCPDCHGTLTEIGSVVQRQELVFIPAQLKRINHVQHAYKCQTCSDNSLSDKIIKAPVPKAPLAHSLGSASIIAHTVHQKFTLKVPNYRQEEDWNKLGLSISRKEIANWHIKSSQYYFEPLYDLLRDILLSQEVIHADETSYRVLESDTQLTYYWTFLSGKHEKKGITLYHHDKRRSGLVTQEVLGDYSGYVHCDMHGAYRQLEHAKLVGCWAHVRRKFFEATPKQADKTSLGRKGLVYCDKLFALEAEWCELPPQERLVKRKEILTPLMTTFFDWCREQVVLSGSKLGLAIAYSLKHERTFRTVLEDGHIVLSNNMAERAIKSLVMGRKNWLFSQSFEGAKAAAIIMSLLETAKRHGLNSEKYISYLLDRLPNEETLAKREVLEAYLPWAKKVQTNCQ

Spy0133 is referred to as a conserved hypothetical protein. An example of an amino acid sequence of Spy0133 is set forth in SEQ ID NO: 54. SEQ ID NO: 54 MTIRLNDLGQVYLVCGKTDMRQGIDSLAYLVKSQHELDLFSGAVYLFCGGRTRDRFKALYWDGQGFWLLYKRFENGKLAWPRNRDEVKCLTAVQVDWLMK GFFISPNIKISKSHDFY

Spy0135 is a SrtB type sortase. It is also referred to as a putative fimbria-associated protein. An example of an amino acid sequence of Spy0135 is set forth in SEQ ID NO: 55. SEQ ID NO: 55 MECYRDRQLLSTYHKQVTQKKPSEMEEVWQKAKAYNARLGIQPVPDAFSFRDGIHDKNYESLLQIENNDIMGYVEVPSIKVTLPIYHYTTDEVLTKGAGHLFGSALPVGGDGTHTVISAHRGLPSAEMETNLNLVKKGDTFYFRVLNKVLAYKVDQILTVEPDQVTSLSGVMGKDYATLVTCTPYGVNTKRLLVRGHRIAYHYKKYQQAKKAMKLVDKSRMWAEVVCAAFGVVIAIILVFMYSRVSAKKS K

GAS AI-3 Sequences

As discussed above, a GAS AI-3 sequence is present in M3, M18 and M5 strain isolates. Examples of GAS AI-3 sequences from M3 strain isolate MGAS315 are set forth below.

SpyM30097 is a negative transcriptional regulator (Nra). An example of an amino acid sequence of SpyM30097 is set forth in SEQ ID NO: 56. SEQ ID NO: 56 MPYVKKKKDSFLVETYLEQSIRDKSELVLLLFKSPTIIFSHVAKQTGLTAVQLKYYCKELDDFFGNNLDTTIKKGKIICCFVKPVKEFYLHQLYDTSTILKLLVFFIKNGTSSQPLIKFSKKYFLSSSSAYRLRESLIKLLREFGLRVSKNTIVGEEYRIRYLIAMLYSKFGIVIYPLDHLDNQIIYRFLSQSATNLRTSPWLEEPFSFYNMLLALSWKRHQFAVSIPQTRIFRQLKKLFIYDCLTRSSRQVIENAFSLTFSQGDLEYLFLIYITTNNSFASLQWTPQHIETCCHIFEKNDTFRLLLEPILKRLPQLNHSKQDLIKALMYFSKSFLFNLQHFVIEIPSFSLPTYTGNSNLYKALKNIVNQWLAQLPGKRHLNEKHLQLFCSHIEQILKNKQPALTVVLISSNFINAKLLTDTIPRYPSDKGIHFYSFYLLRDDIYQIPSLKPDLVITHSRLIPFVKNDLVKGVTVAEFSFDNPDYSIASIQNLTYQLKDK KYQDFLNEQLQ

SpyM30098 is thought to be a collagen binding protein (Cpb). It containsa sortase substrate motif VPXTG (SEQ ID NO: 137) shown in italics in SEQID NO: 57. SEQ ID NO: 57MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSVPNKQSSVQDYPWYGYDSYSKGYPDYSPLKTYHNLKVNLDGSKEYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNDRNGIMKGIDPLNAILVTQNAIWYYTDSSYISDTSKAFQQEETDLKLDSQQLQLMRNALKRLINPKEVESLPNQVPANYQLSIFQSSDKTFQNLLSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEKIFDSNKSGEKVELPNGTYVLSELKPPQGYGVATPITFKVAAEKVLIKNKEGQFVENQNKEIAEPYSVTAFNDEEEIGYLSDFNNYGKFYYAKNTNGTNQVVYCFNADLHSPPDSYDHGANIDPDVSESKEIKYTHVSGYDLYKYAATPRDKDADFFLKHIKKILDKGYKKKGDTYKTLTEAQFRAATQLAIYYYTDSADLTTLKTYNDNKGYHGFDKLDDATLAVVHELITYAEDVTLPMTQNLDEFVPNSSRYQALIGTQYHPNELIDVISMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETVAFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGTKK

SpyM30098 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 184 VPPTG (shown in italics in SEQ ID NO: 57, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM30098 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM30098. The pilin motif sequenceis underlined in SEQ ID NO: 57, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residues 262 and 270. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of SpyM30098 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 57MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSVPNKQSSVQDYPWYGYDSYSKGYPDYSPLKTYHNLKVNLDGSKEYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNDRNGIMKGTDPLNATLVTQNAIWYYTDSSYISDTSKAFQQEETDLKLDSQQLQLMRNALKRLINPKEVESLPNQVPANYQLSIFQSSDKTFQNL LSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEKIFDSNKSGEKVELPNGTYVLSELKPPQGYGVATPITFKVAAEKVLIKNKEGQFVENQNKEIAEPYSVTAFNDFEEIGYLSDFNNYGKFYYAKNTNGTNQVVYCFNADLHSPPDSYDHGANIDPDVSESKEIKYTHVSGYDLYKYAATPRDKDADFFLKHIKKILDKGYKKKGDTYKTLTEAQFRAATQLAIYYYTDSADLTTLKTYNDNKGYHGFDKLDDATLAVVHELITYAEDVTLPMTQNLDFFVPNSSRYQALIGTQYHPNELIDVISMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETVAFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGTKK

An E box containing a conserved glutamic residue has been identified inSpyM30098. The E-box motif is underlined in SEQ ID NO: 57, below. Theconserved glutamic acid (E), at amino acid residue 330, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyM30098. Preferred fragments of SpyM30098include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 57MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSVPNKQSSVQDYPWYGYDSYSKGYPDYSPLKTYHNLKVNLDGSKEYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNDRNGIMKGTDPLNATLVTQNAIWYYTDSSYISDTSKAFQQEETDLKLDSQQLQLMRNALKRLINPKEVESLPNQVPANYQLSIFQSSDKTFQNLLSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEKIFDSNKSGEKVELPNGTYVLSELKPPQGYGVATPITFKVAAEKVLIKNKEGQFVENQNKEIAEPYSVTAFNDFEEIGYLSDFNNYGKFYYAKNTNGTNQVVYCFNADLHSPPDSYDHGANIDPDVSESKEIKYTHVSGYDLYKYAATPRDKDADFFLKHIKKILDKGYKKKGDTYKTLTEAQFRAATQLAIYYYTDSADLTTLKTYNDNKGYHGFDKLDDATLAVVHELITYAEDVTLPMTQNLDFFVPNSSRYQALIGTQYHPNELIDVISMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETVAFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGTKK

SpyM30099 is referred to as LepA. An example of an amino acid sequence of SpyM30099 is set forth in SEQ ID NO: 58. SEQ ID NO: 58 MTNYLNRLNENPLLKAFIRLVLKISTIGFLGYILFQYVFGVMIVNTNQMSPAVSAGDGVLYYRLTDRYHINDVVVYEVDDTLKVGRIAAQAGDVENFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKVPTGTYFILNDYREERLDSRYYGALPINQIKGKISTLLRVRGI

SpyM30100 is thought to be a fimbrial protein. An example of an aminoacid sequence of SpyM30100 is set forth in SEQ ID NO: 59. SEQ ID NO: 59MKKNKLLLATAILATALGTASLNQNVKAETAGVSENAKLIVKKTFDSYTDNEVLMPKADYTFKVEADSTASGKTKDGLEIKPGIVNGLTEQIISYTNTDKPDSKVKSTEFDFSKVVFPGIGVYRYTVSEKQGDVEGITYDTKKWTVDVYVGNKEGGGFEPKFIVSKEQGTDVKKPVNFNNSFATTSLKVKKNVSGNTGELQKEFDFTLTLNESTNFKKDQIVSLQKGNEKFEVKIGTPYKFKLKNGESIQLDKLPVGITYKVNEMEANKDGYKTTASLKEGDGQSKMYQLDMEQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

SpyM30100 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 140 QVPTG (shown in italics in SEQ ID NO: 59, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM30100 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, discussed above, containing conserved lysine (K)residues have also been identified in SpyM30100. The pilin motifsequences are underlined in SEQ ID NO: 59, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 57 and 63 andat amino acid residues 161 and 166. The pilin sequences, in particularthe conserved lysine residues, are thought to be important for theformation of oligomeric, pilus-like structures. Preferred fragments ofSpyM30100 include at least one conserved lysine residue. Preferably,fragments include at least one pilin sequence. SEQ ID NO: 59MKKNKLLLATAILATALGTASLNQNVKAETAGVSENAKLIVKKTFDSYTD NEVLMPKADYTFKVEADSTASGKTKDGLEIKPGIVNGLTEQIISYTNTDKPDSKVKSTEFDFSKVVFPGIGVYRYTVSEKQGDVEGITYDTKKWTVDVYV GNKEGGGFEPKFIVS KEQGTDVKKPVNFNNSFATTSLKVKKNVSGNTGELQKEFDFTLTLNESTNFKKDQIVSLQKGNEKFEVKIGTPYKFKLKNGESIQLDKLPVGITYKVNEMEANKDGYKTTASLKEGDGQSKMYQLDMEQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

Two E boxes, each containing a conserved glutamic residue, have beenidentified in SpyM30100. The E-box motifs are underlined in SEQ ID NO:59, below. The conserved glutamic acid (E) residues, at amino acidresidues 232 and 264, are marked in bold. The E box motifs, inparticular the conserved glutamic acid residues, are thought to beimportant for the formation of oligomeric pilus-like structures ofSpyM30100. Preferred fragments of SpyM30100 include at least oneconserved glutamic acid residue. Preferably, fragments include at leastone E box motif. SEQ ID NO: 59MKKNKLLLATAILATALGTASLNQNVKAETAGVSENAKLIVKKTFDSYTDNEVLMPKADYTFKVEADSTASGKTKDGLEIKPGIVNGLTEQIISYTNTDKPDSKVKSTEFDFSKVVFPGIGVYRYTVSEKQGDVEGITYDTKKWTVDVYVGNKEGGGFEPKFIVSKEQGTDVKKPVNFNNSFATTSLKVKKNVSGNTGELQKEFDFTLTLNESTNFKKDQIVSLQKGNEKFEVKIGTPYKFKLKNGESIQLDKLPVGITYKVNEMEANKDGYKTTASLKEGDGQSKMYQLDMEQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

SpyM30101 is a SrtC2 type sortase. An example of an amino acid sequenceof SpyM30101 is set forth in SEQ ID NO: 60. SEQ ID NO: 60MTIVQVINKAIDTLILIFCLVVLFLAGFGLWDSYHLYQQADASNPKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKAVDGSVANSGSLFLDTRNHNDFTDDYSLIYGHHMAGNAMFGEIPKFLKKDFPSKHNKATIETKERKKLTVTIFACLKTDAFNQLVFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVIVVGTTQE

SpyM30102 is referred to as a hypothetical protein. An example of an amino acid sequence of SpyM30102 is set forth in SEQ ID NO: 61. SEQ ID NO: 61 MILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQASTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKWLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

SpyM30102 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 185 LPLAG (shown in italics in SEQ ID NO: 61, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM30102 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM30102. The pilin motif sequenceis underlined in SEQ ID NO: 61, below. The conserved lysine (K) residueis also marked in bold, at amino acid residue 132. The pilin sequence,in particular the conserved lysine residues, are thought to be importantfor the formation of oligomeric, pilus-like structures. Preferredfragments of SpyM30102 include the conserved lysine residue. Preferably,fragments include the pilin sequence. SEQ ID NO: 61MILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKWLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

Two E boxes containing conserved glutamic acid residues have been identified in SpyM30102. The E-box motifs are underlined in SEQ ID NO: 61, below. The conserved glutamic acid (E) residues, at amino acid residues 52 and 122, are marked in bold. The E box motifs, in particular the conserved glutamic acid residues, are thought to be important for the formation of oligomeric pilus-like structures of SpyM30102. Preferred fragments of SpyM30102 include at least one conserved glutamic acid residue. Preferably, fragments include at least one E box motif. SEQ ID NO: 61 MILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQASTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKWLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

SpyM30103 is referred to as a putative multiple sugar metabolism regulator. An example of an amino acid sequence for SpyM30103 is set forth in SEQ ID NO: 62. SEQ ID NO: 62 MVRFDLKHVQTLHSLSQLPISVMSQDKALIQVYGNDDYLLCYYQFLKHLAIPQAAQDVIFYEGLFEESFMIFPLCHYIIAIGPGYPYSLNKDYQEQLANNCLKHSSHRSKEELLSYMALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTIHQLLQHSKQMTADPDIIHRLKHISKASSQLPPVLEHLNHTMDLVKLGNPQLLKQETNRIPLSSITSSSISALRAEKNLTVIYLTRLLEFSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDIAKRLYVSESHLRSVFKKYSNVSLQHYILSTKIKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGISSKDYLAKYRDN I

SpyM30104 is thought to be a F2 like fibronectic binding protein. Anexample of an amino acid sequence for SpyM30104 is set forth in SEQ IDNO: 63. SEQ ID NO: 63 MSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDTTEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEFGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEVATAITFTVNEQGQVTVNGKATKGKAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTKIEDSKSSDVIVGGQGQIVETTEDTQTGMHGDSGRKTEVEDTKLVQSFHEDNKEPESNSEIPKKDKSKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVISLKSK KRLSSC

SpyM30104 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 180 LPATG (shown in italics in SEQ ID NO: 63, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM30104 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, discussed above, containing conserved lysine (K)residues have also been identified in SpyM30104. The pilin motifsequences are underlined in SEQ ID NO: 63, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 156 and 227.The pilin sequences, in particular the conserved lysine residues, arethought to be important for the formation of oligomeric, pilus-likestructures. Preferred fragments of SpyM30104 include at least oneconserved lysine residue. Preferably, fragments include at least onepilin sequence. SEQ ID NO: 63MSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDTTEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEFGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEVATAITFTVNEQGQVTVNGKATKGKAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTKIEDSKSSDVIVGGQGQIVETTEDTQTGMHGDSGRKTEVEDTKLVQSFHEDNKEPESNSEIPKKDKSKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVISLKSK KRLSSC

An E box containing a conserved glutamic residue has been identified inSpyM30104. The E-box motif is underlined in SEQ ID NO: 63, below. Theconserved glutamic acid (E), at amino acid residue 402, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyM30104. Preferred fragments of SpyM30104include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 63MSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEFGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEVATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDTEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTKIEDSKSSDVIVGGQGQIVETTEDTQTGMHGDSGRKTEVEDTKLVQSFHFDNKEPESNSEIPKKDKSKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVISLKSK KRLSSC

Examples of GAS AI-3 sequences from M3 strain isolate SSI-1 are set forth below.

Sps0099 is a negative transcriptional regulator (Nra). An example of anamino acid sequence for Sps0099 is set forth in SEQ ID NO: 64. SEQ IDNO: 64 MPYVKKKKDSPLVETYLEQSIRDKSELVLLLFKSPTIIFSHVAKQTGLTAVQLKYYCKELDDFFGNNLDITIKKGKIICCFVKPVKEFYLHQLYDTSTILKLLVFFIKNGTSSQPLIKFSKKYFLSSSSAYRLRESLIKLLREFGLRVSKNTIVGEEYRIRYLIAMLYSKFGIVIYPLDHLDNQIIYRFLSQSATNLRTSPWLEEPFSFYNMLLALSWKRHQFAVSIPQTRIFRQLKKLFIYDCLTRSSRQVIENAFSLTESQGDLEYLFLIYITTNNSFASLQWTPQHIETCCHIFEKNDTFRLLLEPILKRLPQLNHSKQDLIKALMYFSKSFLFNLQHFVIEIPSFSLPTYTGNSNLYKALKNIVNQWLAQLPGKRHLNEKHLQLFCSHIEQILKNKQPALTVVLTSSNFINAKLLTDTIPRYFSDKGIHFYSFYLLRDDIYQIPSLKPDLVTTHSRLIPFVKNDLVKGVTVAEFSFDNPDYSIASIQNLIYQLKDK KYQDFLNEQLQ

Sps0100 is thought to be a collagen binding protein (Cbp). It contains asortase substrate motif VPXTG shown in italics in SEQ ID NO: 65. SEQ IDNO: 65 MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSVPNKQSSVQDYPWYGYDSYSKGYPDYSPLKTYHNLKVNLDGSKEYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNDRNGIMKGIDPLNAILVTQNAIWYYTDSSYISDTSKAFQQEETDLKLDSQQLQLMRNALKRLINPKEVESLPNQVPANYQLSIFQSSDKTFQNLLSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEKIFDSNKSGEKVELPNGTYVLSELKPPQGYGVATPITFKVAAEKVLIKNKEGQFVENQNKETAEPYSVTAFNDFEEIGYLSDFNNYGKFYYAKNTNGTNQVVYCFNADLHSPPDSYDHGANIDPDVSESKEIKYTHVSGYDLYKYAATPRDKDADFFLKHIKKILDKGYKKKGDTYKTLTEAQFRAATQLAIYYYTDSADLTTLKTYNDNKGYHGFDKLDDATLAVVHELITYAEDVTLPMTQNLDFFVPNSSRYQALIGTQYHPNELIDVISMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETVAFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGTKK

Sps0101 is referred to as a LepA protein. An example of an amino acid sequence of Sps0101 is set forth as SEQ ID NO: 66. SEQ ID NO: 66 MTNYLNRLNENPLLKAFIRLVLKISIIGFLGYILFQYVFGVMIVNTNQMSPAVSAGDGVLYYRLTDRYHINDVVVYEVDDTLKVGRIAAQAGDEVNFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKVPTGTYFILNDYREERLDSRYYGALPINQIKGKISTLLRVRGI

Sps0102 is thought to be a fimbrial protein. It contains a sortasesubstrate motif QVXTG shown in italics in SEQ ID NO: 67. SEQ ID NO: 67MEREKMKKNKLLLATAILATALGTASLNQNVKAETAGVSENAKLIVKKTFDSYTDNEVLMPKADYTFKVEADSTASGKTKDGLEIKPGIVNGLTEQIISYTNTDKPDSKVKSTEFDFSKVVFPGIGVYRYTVSEKQGDVEGITYDTKKWTVDVYVGNKEGGGFEPKFIVSKEQGTDVKKPVNFNNSFATTSLKVKKNVSGNTGELQKEFDFTLTLNESTNFKKDQIVSLQKGNEKFEVKIGTPYKFKLKNGESIQLDKLPVGITYKVNEMEANKDGYKTTASLKEGDGQSKMYQLDMEQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

Sps0103 is a SrtC2 type sortase. An example of Sps0103 is set forth inSEQ ID NO: 68. SEQ ID NO: 68MVMTIVQVINKATDTLILTFGLVVLFLAGFGLWDSYHLYQQADASNFKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKAVDGSVAMSGSLFLDTRNHNDFTDDYSLTYGHHMAGNAMFGEIPKFLKKDFFSKHNKAIIETKERKKLTVTIFACLKTDAFNQLVFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVTVVGTIQE

Sps0104 is referred to as a hypothetical protein. It contains a sortase substrate motif LPXAG shown in italics in SEQ ID NO: 69. SEQ ID NO: 69 MLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKWLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

Sps0105 is referred to as a putative multiple sugar metabolismregulator. An example of Sps0105 is set forth in SEQ ID NO: 70. SEQ IDNO: 70 MALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTIHQLLQHSKQMTADPDITHRLKHISKASSQLPPVLEHLNHIMDLVKLGNPQLLKQEINRIPLSSITSSSISALRAEKNLTVIYLTRLLEPSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDIAKRLYVSESHLRSVEKKYSNVSLQHYTLSTKIKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGISSKDYLAKYRDNI

Sps0106 is thought to be a F2 like fibronectic binding protein. Itcontains a sortase substrate LPXTG (SEQ ID NO: 122) shown in italics inSEQ ID NO: 71. SEQ ID NO: 71MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDATWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPTIYFKLYRQLPGEKEVAVDDAELKQTNSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEFGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTTEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEVATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTKIEDSKSSDVIVGGQGQIVETTEDTQTGMHGDSGRKTEVEDTKLVQSFHFDNKEPESNSEIPKKDKSKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVISLKSKKRLSSC

Examples of GAS AI-3 sequences from M5 isolate Manfredo are set forth below.

Orf 77 encodes a negative transcription regulator (Nra). An example ofthe nucleotide sequence encoding Nra (SEQ ID NO: 88) and an Nra aminoacid sequence (SEQ ID NO: 89) are set forth below. SEQ ID NO: 88ATGCCTTATGTCAAAAAGAAAAAGGATAGTTTCTTAGTAGAAACATATCTTGAACAGTCTATTAGAGATAAAAGTGAATTAGTCTTACTGTTATTTAAATCGCCTACTATCATTTTTTCTCATGTTGCTAAACAAACTGGTCTGACGGCTGTACAATTAAAATATTACTGTAAAGAACTTGATGACTTTTTTGGAAATAATTTAGACATTACCATTAAAAAGGGCAAAATAATATGTTGTTTTGTCAAACCTGTTAAGGAATTCTACCTTCATCAACTCTATGACACATCAACAATATTAAAATTATTAGTTTTCTTTATTAAAAATGGAACGTCATCACAACCTCTGATTAAATTTTCAAAAAAGTATTTTCTATCAAGCTCCTCAGCTTATCGACTACGGGAATCGCTGATCAAATTACTACGGGAATTTGGCTTGAGAGTCTCAAAAAATACAATTGTCGGAGAGGAATATCGTATTCGCTATCTTATTGCCATGCTATATAGTAAATTTGGCATTGTCATCTATCCGTTAGATCATCTAGACAATCAAATTATTTATCGCTTCTTATCACAAAGTGCAACCAATTTAAGAACATCGCCCTGGCTAGAGGAACCTTTTTCTTTTTATAATATGTTACTTGCCTTGTCATGGAAACGTCACCAATTTGCAGTTAGCATTCCTCAAACACGTATTTTTCGACAATTAAAAAAGCTTTTTATCTATGATTGTTTAACTCGAAGCAGTCGACAAGTAATCGAAAATGCTTTTTCGTTAATGTTCTCACAAGGAGATCTCGATTATCTTTTTTTAATTTATATTACCACCAATAATTCCTTTGCCAGCCTACAATGGACTCCACAGCATATTGAAACTTGCTGCCATATTTTTGAAAAAAATGACACATTTCGGTTATTGTTAGAGCCCATTCTTAAACGTTTACCGCAATTAAACCATTCTAAACAAGACCTTATTAAAGCCCTTATGTATTTTTCAAAATCTTTTCTATTTAACCTCCAACATTTCGTCATCGAGATTCCTTCTTTTTCCTTGCCGACCTATACAGGCAACTCTAATCTTTACAAAGCTTTAAAAAATATTGTAAATCAGTGGCTTGCTCAATTACCCGGAAAGCGTCATCTTAACGAAAAGCATCTCCAACTTTTTTGCTCTCATATTGAACAAATCTTAAAAAATAAACAACCTGCTTTAACTGTCGTTTTAATATCTAGTAACTTTATAAATGCTAAACTCCTTACAGATACTATCCCACGATATTTTTCTGATAAAGGAATTCATTTTTATTCTTTTTACTTATTAAGAGATGATATCTATCAAATTCCAAGCTTAAAACCAGATTTAGTTATCACTCATAGCCGATTAATTCCTTTTGTTAAGAATGATCTGGTCAAAGGTGTTACTGTTGCTGAATTTTCTTTTGATAACCCTGACTACTCTATTGCTTCAATTCAAAACTTGATATATCAGCTCAAAGATAAAAAATATCAAGATTTTCTAAACGAGCAATTACAA SEQ ID NO: 
89MPYVKKKKDSFLVETYLEQSIRDKSELVLLLFKSPTIIFSHVAKQTGLTAVQLKYYGKELDDFPGNNLDITIKKGKIICCFVKPVKEFYLHQLYDTSTILKLLVFFIKNGTSSQPLIKFSKKYFLSSSSAYRLRESLIKLLREFGLRVSKNTIVGEEYRIRYLIAMLYSKFGIVIYPLDHLDNQIIYRFLSQSATNLRTSPWLEEPFSFYNMLLALSWKRHQFAVSIPQTRIFRQLKKLFIYDCLTRSSRQVIENAFSLMFSQGDLDYLFLIYITTNNSFASLQWTPQHIETCCHIFEKNDTFRLLLEPILKRLPQLNHSKQDLIKALMYFSKSFLFNLQHFVIEIPSFSLPTYTGNSNLYKALKNIVNQWLAQLPGKRHLNEKHLQLFCSHIEQILKNKQPALTVVLISSNFINAKLLTDTIPRYFSDKGIHFYSEYLLRDDIYQIPSLKPDLVITHSRLIPFVKNDLVKGVTVAEFSFDNPDYSIASIQNLIYQLKDK KYQDFLNEQLQ

Orf 78 is thought to be a collagen binding protein (Cbp). An example ofthe nucleotide sequence encoding Cbp (SEQ ID NO: 90) and a Cbp aminoacid sequence (SEQ ID NO: 91) are set forth below. SEQ ID NO: 90TTGCAAAAGAGGGATAAAACCAATTATGGAAGCGCTAACAACAAACGACGACAAACGACGATCGGATTACTGAAAGTATTTTTGACGTTTGTAGCTCTGATAGGAATAGTAGGGTTTTCTATCAGAGCGTTCGGAGCTGAAGAAAAATCTACTGAAACTAAAAAAACGTCAGTCATTATTAGAAAATATGCTGAAGGTGACTACTCTAAACTTCTAGAGGGAGCAACTTTGCGTTTAACAGGGGAAGATATCCCAGATTTTCAAGAAAAAGTCTTCCAAAGTAATGAACAGGAGAAAAAGATTGAATTATCAAATGGGACTTATACCTTAACAGAAACATCATCTCCAGATGGATATAAAATTACGGAGCCGATTAAGTTTAGAGTAGTGAATAAAAAAGTATTTATCGTCCAAAAAGATGGTTCTCAAGTGGAAAACCCAAACAAAGAACTAGGTTCTCCATATACTATAGAGGCATACAATGATTTTGATGAATTTGGCTTACTGTCAACACAAAATTATGCGAAATTTTATTATGGAAAAAACTATGATGGCAGTTCACAAATTGTTTATTGCTTCAATGCCAACTTGAAATCTCCACCTGACTCGGAAGATCATGGTGCTACAATAAATCCTGACTTTACGACTGGTGATATTAGGTACAGTCATATTGCTGGTTCAGATTTGATAAAATACGCTAATACAGCTAGGGATGAAGATCCTCAATTATTTTTAAAACACGTAAAAAAAGTAATTGAAAATGGGTATCATAAAAAAGGTCAAGCTATTCCATATAACGGTCTGACTGAGGCACAGTTTCGTGCGGCTACTCAACTGGCAATTTATTATTTTACAGATAGTGTTGACTTAACTAAGGATAGATTGAAAGACTTCCATGGATTTGGAGATATGAATGATCAAACTTTGGGTGTAGCTAAAAAAATTGTAGAATACGCTTTGAGTGATGAAGATTCAAAACTAACAAATCTTGATTTCTTCGTACCTAATAATAGCAAATACCAATCTCTTATTGGGACAGAATACCATCCAGATGATTTGGTTGACGTGATTCGTATGGAAGATAAAAAGCAAGAAGTTATTCCAGTAACTCATAGTTTGACGGTGCAAAAAACAGTAGTCGGTGAGTTGGGAGATAAGACTAAAGGCTTTCAATTTGAACTTGAGTTGAAAGATAAAACTGGACAGCCTATTGTTAACACTCTAAAAACTAATAATCAAGATTTAGTAGCTAAAGATGGGAAATATTCATTTAATCTAAAGCATGGTGACACCATAAGAATAGAAGGATTACCGACGGGATATTCTTATACCCTGAAAGAGACTGAAGCTAAGGATTATATAGTAACTGTTGATAACAAAGTTAGTCAAGAAGCTCAATCAGCAAGTGAGAATGTCACAGCAGACAAAGAAGTCACTTTTGAAAACCGAAAAGATCTTGTCCCACCAACTGGTTTGACAACAGATGGGGCTATCTATCTTTGGTTATTACTACTTGTTCCATTTGGGTTATTGGTTTGGCTATTTGGTCG TAAAGGGTTAAAAAATGACSEQ ID NO: 91 
MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEKSTETKKTSVIIRKYAEGDYSKLLEGATLRLTGEDIPDFQEKVFQSNGTGEKIELSNGTYTLTETSSPDGYKITEPEKFRVVNKKVFIVQKDGSQVENPNKELGSPYTIEAYNDFDEFGLLSTQNYAKFYYGKNYDGSSQIVYCFNANLKSPPDSEDHGATINPDFTTGDIRYSHIAGSDLIKYANTARDEDPQLFLKHVKKVIENGYHKKGQATPYNGLTEAQFRAATQLAIYYETDSVDLTKDRLKDPHGFGDMNDQTLGVAKKIVEYALSDEDSKLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVTPVTHSLTVQKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSASENVTADKEVTFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGLKND

Orf 78 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 184 VPPTG (shown in italics in SEQ ID NO: 91, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Orf 78 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Three E boxes containing conserved glutamic residues have beenidentified in Orf 78. The E-box motifs are underlined in SEQ ID NO: 91,below. The conserved glutamic acid (E) residues, at amino acid residues112, 395, and 447, are marked in bold. The E box motifs, in particularthe conserved glutamic acid residues, are thought to be important forthe formation of oligomeric pilus-like structures of Orf 78. Preferredfragments of Orf 78 include at least one conserved glutamic acidresidue. Preferably, fragments include at least one E box motif. SEQ IDNO: 91 MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEKSTETKKTSVIIRKYAEGDYSKLLEGATLRLTGEDIPDFQEKVFQSNGTGEKIELSNGTYTLTETSSPDGYKITEPEKFRVVNKKVFIVQKDGSQVENPNKELGSPYTIEAYNDFDEFGLLSTQNYAKFYYGKNYDGSSQIVYCFNANLKSPPDSEDHGATINPDFTTGDIRYSHIAGSDLIKYANTARDEDPQLFLKHVKKVIENGYHKKGQATPYNGLTEAQFRAATQLAIYYETDSVDLTKDRLKDPHGFGDMNDQTLGVAKKIVEYALSDEDSKLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVTPVTHSLTVQKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSASENVTADKEVTFENRKDLVPPTGLTTDGAIYLWLLLLVPFGLLVWLFGRKGLKND

Orf 79 is thought to be a LepA signal peptidase I. An example of thenucleotide sequence encoding a LepA signal peptidase I (SEQ ID NO: 92)and a LepA signal peptidase I amino acid sequence (SEQ ID NO: 93) areset forth below. SEQ ID NO: 92ATGACTAATTACCTAAATCGTTTAAATGAGAATTCACTATTTAAAGCTTTCATACGGTTAGTACTTAAGATTTCTATTATTGGGTTTCTAGGTTACATTCTATTTCAGTATGTTTTTGGTGTTATGATTATTAACACTAATGATATGAGTCCTGCTTTAAGTGCAGGTGACGGTGTTTTATATTATCGTTTGACTGATCGCTATCATATTAATGATGTGGTGGTCTATGAGGTTGATAACACTTTGAAAGTTGGTCGAATTGTCGCTCAAGCTGGCGATGAGGTTAGTTTTACGCAAGAAGGAGGACTGTTGATTAATGGGCATCCACCAGAAAAAGAGGTCCCTTACCTGACGTATCCTCACTCAAGTGGCCCAAACTTTCCCTATAAAGTTCCTACGGGTAAGTATTTCATATTGAATGATTATCGTGAAGAACGTTTGGACAGTCGTTATTATGGGGCGTTACCCGTCAATCAAATAAAAGGGAAAATCTCAACTCT ATTAAGAGTGAGAGGAATTSEQ ID NO: 93 MTNYLNRLNENSLFKAFIRLVLKISIIGFLGYTLFQYVFGVMIINTNDMSPALSAGDGVLYYRLTDRYHINDVVVYEVDNTLKVGRTVAQAGDEVSFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKTPTGKYFILNDYREERLDSRYYGALPVNQIKGKISTLLRVRGI
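The relationship between the listed nucleotide sequences and their amino acid sequences can be checked with a short translation routine. The following Python sketch is an illustration under stated assumptions: it uses the standard genetic code, and it reads an initial ATG, GTG, or TTG codon as methionine, as happens at bacterial start codons (several of the nucleotide sequences listed here begin with TTG or GTG while the encoded protein begins with M). The names `CODON_TABLE` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks stop
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed initiator codons, read as M


def translate(dna: str) -> str:
    """Translate a coding sequence from its first base up to the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop whitespace left by line wrapping
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[pos:pos + 3]
        if pos == 0 and codon in START_CODONS:
            residue = "M"
        else:
            residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First seven codons of SEQ ID NO: 92 and of SEQ ID NO: 90, from the listings above
print(translate("ATGACTAATTACCTAAATCGT"))  # MTNYLNR (start of SEQ ID NO: 93)
print(translate("TTGCAAAAGAGGGATAAAACC"))  # MQKRDKT (start of SEQ ID NO: 91)
```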

Orf 80 is thought to to be a fimbrial protein. An example of thenucleotide sequence encoding the fimbrial protein (SEQ ID NO: 94) and afimbrial protein amino acid sequence (SEQ ID NO: 95) are set forthbelow. SEQ ID NO: 94 TTGGAGAGAGAAAAAATGAAAAAAAACAAATTATTACTTGCTACTGCAATCTTAGCAACTGCTTTAGGAACAGCTTCTTTAAATCAAAACGTAAAAGCTGAGACGGCAGGGGTTGTAACAGGAAAATCACTACAAGTTACAAAGACAATGACTTATGATGATGAAGAGGTGTTAATGCCCGAAACCGCCTTTACTTTTACTATAGAGCCTGATATGACTGCAAGTGGAAAAGAAGGCAGCCTAGATATTAAAAATGGAATTGTAGAAGGCTTAGACAAACAAGTAACAGTAAAATATAAGAATACAGATAAAACATCTCAAAAAACTAAAATAGCACAATTTGATTTTTCTAAGGTTAAATTTCCAGCTATAGGTGTTTACCGCTATATGGTTTCAGAGAAAAACGATAAAAAAGACGGAATTACGTACGATGATAAAAAGTGGACTGTAGATGTTTATGTTGGGAATAAGGCCAATAACGAAGAAGGTTTCGAAGTTCTATATATTGTATCAAAAGAAGGTACTTCTAGTACTAAAAAACCAATTGAATTTACAAACTCTATTAAAACTACTTCCTTAAAAATTGAAAAACAAATAACTGGCAATGCAGGAGATCGTAAAAAATCATTCAACTTCACATTAACATTACAACCAAGTGAATATTATAAAACTGGATCAGTTGTGAAAATCGAACAGGATGGAAGTAAAAAAGATGTGACGATAGGAACGCCTTACAAATTTACTTTGGGACACGGTAAGAGTGTCATGTTATCGAAATTACCAATTGGTATCAATTACTATCTTAGTGAAGACGAAGCGAATAAAGACGGCTACACTACAACGGCAACATTAAAAGAACAAGGCAAAGAAAAGAGTTCCGATTTCACTTTGAGTACTCAAAACCAGAAAACAGACGAATCTGCTGACGAAATCGTTGTCACAAATAAGCGTGACACTCAAGTTCCAACTGGTGTTGTAGGGACCCTTGCTCCATTTGCAGTTCTTAGCATTGTGGCTATTGGTGGAGTTATCTATATTACAAAACGTAAA AAAGCT SEQ ID NO: 95MEREKMKKNKLLLATAILATALGTASLNQNVKAETAGVVTGKSLQVTKTMTYDDEEVLMPETAFTPTIEPDMTASGKEGSLDIKNGIVEGLDKQVTVKYKNTDKTSQKTKIAQFDFSKVKFPAIGVYRYMVSEKNDKKDGITYDDKKWTVDVYVGNKANNEEGPEVLYIVSKEGTSSTKKPIEFTNSIKTTSLKIEKQITGNAGDRKKSFNFTLTLQPSEYYKTGSVVKIEQDGSKKDVTIGTPYKFTLGHGKSVMLSKLPIGINYYLSEDEANKDGYTTTATLKEQGKEKSSDFTLSTQNQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRK KA

Orf 80 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 140 QVPTG (shown in italics in SEQ ID NO: 95, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Orf 80 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

An E box containing a conserved glutamic residue has been identified inOrf 80. The E-box motif is underlined in SEQ ID NO: 95, below. Theconserved glutamic acid (E), at amino acid residue 270, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of Orf 80. Preferred fragments of Orf 80 includeat least one conserved glutamic acid residue. Preferably, fragmentsinclude at least one E box motif. SEQ ID NO: 95MEREKMKKNKLLLATAILATALGTASLNQNVKAETAGVVTGKSLQVTKTMTYDDEEVLMPETAFTPTIEPDMTASGKEGSLDIKNGIVEGLDKQVTVKYKNTDKTSQKTKIAQFDFSKVKFPAIGVYRYMVSEKNDKKDGITYDDKKWTVDVYVGNKANNEEGPEVLYIVSKEGTSSTKKPIEFTNSIKTTSLKIEKQITGNAGDRKKSFNFTLTLQPSEYYKTGSVVKIEQDGSKKDVTIGTPYKFTLGHGKSVMLSKLPIGINYYLSEDEANKDGYTTTATLKEQGKEKSSDFTLSTQNQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRK KA

Orf 81 is thought to to be a SrtC2 type sortase. An example of thenucleotide sequence encoding the SrtC2 sortase (SEQ ID NO: 96) and aSrtC2 sortase amino acid sequence (SEQ ID NO: 97) are set forth below.SEQ ID NO: 96 GTGATTAGTCAAAGAATGATGATGACAATTGTACAGGTTATCAATAAAGCCATTGATACTCTCATTCTTATCTTTTGTTTAGTCGTACTATTTTTAGCTGGTTTTGGTTTGTGGGATTCTTATCATCTCTATCAACAAGCAGACGCTTCTAATTTCAAAAAATTTAAAACAGCTCAACAACAGCCTAAATTTGAAGACTTGTTAGCTTTGAATGAGGATGTCATTGGTTGGTTAAATATCCCAGGGACTCATATTGATTATCCTCTAGTTCAGGGAAAAACGAATTTAGAGTATATTAATAAAGCAGTTGATGGCAGTGTTGCCATGTCTGGTAGTTTATTTTTAGATACACGGAATCATAATGATTTTACGGACGATTACTCTCTGATTTATGGCCATCATATGGCAGGTAATGCCATGTTTGGCGAAATTCCAAAATTTTTAAAAAAGGATTTTTTCAACAAACATAATAAAGCTATCATTGAAACAAAAGAGAGAAAAAAACTAACCGTCACTATTTTTGCTTGTCTCAAGACAGATGCCTTTGACCAGTTAGTTTTTAATCCTAATGCTATTACCAATCAAGACCAACAAAAGCAGCTCGTTGATTATATCAGTAAAAGATCAAAACAATTTAAACCTGTTAAATTGAAGCATCATACAAAGTTCGTTGCTTTTTCAACGTGTGAAAATTTTTCTACTGACAATCGTGTTATCGTTGTCGGTACTATTCAAGAA SEQ ID NO: 97MISQRMMMTIVQVINKAIDTLILIFCLVVLFLAGFGLWDSYHLYQQADASNFKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKAVDGSVAMSGSLFLDTRNHNDFTDDYSLIYGHHMAGNAMFGEIPKFLKKDFFNKHNKAIIETKERKKLTVTIFACLKTDAFDQLVFNPNAITNQDQQKQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVIVVGTIQE

Orf 82 is referred to as a hypothetical protein. It contains a sortasesubstrate motif LPXAG shown in italics in SEQ ID NO: 99. An example ofthe nucleotide sequence encoding the hypothetical protein (SEQ ID NO:98) and a hypothetical protein amino acid sequence (SEQ ID NO: 99) areset forth below. SEQ ID NO: 98TTGCTTTTTCAACGTGTGAAAATTTTTCTACTGACAATCGTGTTATCGTTGTCGGTACTATTCAAGAATAACGAAAGGAGGAGACTTTTGAGAAAATATTGGAAAATGTTATTTTCTGTCGTAATGATATTAACCATGCTGGCCTTTAATCAGACTGTTTTAGCAAAAGACAGCACTGTTCAAACTAGCATTAGTGTCGAAAATGTCTTAGAGAGAGCAGGCGATAGTACCCCATTTTCGGTTGCATTAGAATCAATTGATGCGATGAAAACAATAGACGAAATAACAATTGCTGGTTCTGGAAAAGCAAGCTTTTCCCCTCTGACCTTCACAACAGTTGGGCAATATACTTATCGTGTTTATCAGAAGCCTTCACAAAATAAAGATTATCAAGCAGATACTACTGTATTTGACGTTCTTGTCTATGTGACCTATGATGAAGATGGGACTCTAGTCGCAAAAGTTATTTCTCGAAGGGCTGGAGACGAAGAAAAATCAGCGATTACTTTTAAGCCCAAACGGTTAGTAAAACCAATACCGCCTAGACAACCTAACATCCCTAAAACCCCATTACCATTAGCTGGTGAAGTAAAAAGTTTATTGGGTATCTTAAGTATCGTATTACTGGGGTTACTAGTTCTTCTTTATGTTAAAAAACTGAAGAGTAGGCTA SEQ ID NO: 99MLFQRVKIFLLTIVLSLSVLFKNNERRRLLRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLEPAGDSTPFSVALESIDAMKTIDEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSTVLLGLLVLLYVKKLKSRL

Orf 82 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 185 LPLAG (shown in italics in SEQ ID NO: 99, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Orf 82 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
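A sortase substrate motif of the form LPXAG, where X is any residue, can be located in a protein sequence with a simple pattern search. The Python sketch below is one hedged way to do so; the function name `find_sortase_motifs` and the regular expression `LP.AG` are assumptions chosen for illustration, and the reported positions are relative to the fragment supplied, not to the full-length SEQ ID NO: 99.

```python
import re


def find_sortase_motifs(protein, pattern="LP.AG"):
    """Return (1-based position, matched residues) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, protein)]


# Fragment of SEQ ID NO: 99 (Orf 82) spanning the anchor motif, from the listing
orf82_fragment = "IPKTPLPLAGEVKSLLGIL"

print(find_sortase_motifs(orf82_fragment))  # [(6, 'LPLAG')]
```

The same function could be called with a different pattern to search for other anchor motif variants.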

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in Orf 82. The pilin motif sequence isunderlined in SEQ ID NO: 99, below. Conserved lysine (K) residues arealso marked in bold, at amino acid residues 173 and 188. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of Orf 82 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 99MLFQRVKIFLLTIVLSLSVLFKNNERRRLLRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSVALESIDAMKTIDEITTAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

An E box containing a conserved glutamic residue has been identified inOrf 82. The E-box motif is underlined in SEQ ID NO: 99, below. Theconserved glutamic acid (E), at amino acid residue 163, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of Orf 82. Preferred fragments of Orf 82 includethe conserved glutamic acid residue. Preferably, fragments include the Ebox motif. SEQ ID NO: 99MLFQRVKIFLLTIVLSLSVLFKNNERRRLLRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLEPAGDSTPFSVALESIDAMKTIDEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSTVLLGLLVLLYVKKLKSRL
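Whether a candidate fragment of Orf 82 retains the conserved residues described above can be checked by comparing the fragment boundaries with the stated residue positions. The Python sketch below assumes 1-based, inclusive fragment coordinates on SEQ ID NO: 99; the residue positions (E163, K173, K188) are those given in the text, while the names `ORF82_CONSERVED` and `conserved_in_fragment` are hypothetical.

```python
from typing import Dict, List

# Conserved residues of Orf 82 (SEQ ID NO: 99), 1-based positions from the text:
# glutamic acid 163 (E box) and lysines 173 and 188 (pilin motif)
ORF82_CONSERVED: Dict[int, str] = {163: "E", 173: "K", 188: "K"}


def conserved_in_fragment(start: int, end: int, conserved: Dict[int, str]) -> List[str]:
    """List conserved residues (e.g. 'E163') inside the 1-based inclusive range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return [f"{aa}{pos}" for pos, aa in sorted(conserved.items()) if start <= pos <= end]


print(conserved_in_fragment(150, 180, ORF82_CONSERVED))  # ['E163', 'K173']
print(conserved_in_fragment(1, 100, ORF82_CONSERVED))    # []
```

A fragment returning a non-empty list would include at least one conserved residue in the sense used above.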

Orf 83 is thought to to be a multiple sugar metabolism regulatorprotein. An example of a nucleotide sequence encoding the sugarmetabolism regulator protein (SEQ ID NO: 100) and a sugar metabolismregulator protein amino acid sequence (SEQ ID NO: 101) are set forthbelow. SEQ ID NO: 100 ATGATACAACTAAGGATGGGGGCAATCTATCAAATGGTTATATTCGATTTAAAACATGTGCAAACATTACACAGCTTGTCTCAATTACCTATTTCAGTGATGTCACAAGATAAGGCACTTATTCAAGTATATGGTAATGACGACTATTTATTATGTTACTATCAATTTTTAAAGCATCTAGCTATTCCTCAAGCTGCACAAGATGTTATTTTTTATGAGGGTTTATTTGAAGAGTCCTTTATGATTTTTCCTCTTTGTCACTACATTATTGCCATTGGACCTTTCTATCCTTATTCACTTAATAAAGACTATCAGGAACAATTAGCTAATAATTTTTTAAAACATTCTTCTCATCGTAGCAAAGAAGAGCTCTTGTCCTATATGGCACTTGTCCCACATTTTCCAATTAATAATGTGCGGAACCTTTTGATAGCTATTGACGCTTTTTTTGACACACAATTTGAGACGACTTGCCAACAAACGATTCATCAATTGTTGCAGCATTCAAAACAGATGACTGCTGATCCTGATATCATTCATCGCCTTAAGCATATTAGCAAAGCATCTAGCCAATTACCGCCTGTTTTAGAGCACCTAAATCATATTATGGATCTGGTAAAGCTAGGCAATCCACAATTGCTCAAGCAAGAAATCAATCGCATCCCCTTATCAAGTATCACCTCATCTTCTATTTCTGCTCTAAGGGCGGAAAGAACCTCACTGTTATCTATTTCAACTAGGTTACTGGAATTCAGTTTTGTAGAAAATACTGACGTAGCAAAGCATTATAGCCTTGTCAAATACTACATGGCCTTAAATGAAGAAGCGAGTGACTTGCTCAAAGTTTTGAGAATTCGCTGTGCAGCTATCATCCATTTTTCCGAATCATTAACCAATAAAAGTATTTCTGATAAACGTCAAATGTACAATAGTGTGCTTCATTATGTCGATAGTCACCTGTATTCCAAATTAAAGGTATCTGATATCGCTAAGCGCCTATATGTTTCCGAATCTCACTTACGTTCAGTCTTTAAAAAATACTCAAATGTTTCCTTACAACATTATATTCTAAGTACAAAAATCAAAGAAGCTCAACTACTCTTAAAACGAGGAATTCCTGTTGGAGAAGTGGCTAAAAGCTTATATTTTTATGACACTACCCATTTTCATAAAATCTTTAAAAAATACACGGGTATTTCTTCAAAAGACTATCTTGCTAAATACCGAGATAATATT SEQ ID NO: 101MIQLRMGAIYQMVIFDLKHVQTLHSLSQLPISVMSQDKALIQVYGNDDYLLCYYQFLKHLAIPQAAQDVIFYEGLFEESFMIFPLCHYIIAIGPFYPYSLNKDYQEQLANNFLKHSSHRSKEELLSYMALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTIHQLLQHSKQMTADPDIIHRLKHISKASSQLPPVLEHLNHIMDLVKLGNPQLLKQEINRIPLSSITSSSISALRAEKNLTVIYLTRLLEFSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDIAKRLYVSESHLRSVFKKYSNVSLQHYILSTKIKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGTS SKDYLAKYRDNI

Orf 84 is thought to to be a F2-like fibronectin-binding protein. Anexample of a nucleotide sequence encoding the F2-likefibronectin-binding protein (SEQ ID NO: 102) and a F2-likefibronectin-binding protein amino acid sequence (SEQ ID NO: 103) are setforth below. SEQ ID NO: 102ATGACACAAAAAAATAGCTATAAGTTAAGCTTCCTGTTATCCCTAACAGGATTTATTTTAGGTTTATTATTGGTTTTTATAGGATTGTCCGGAGTATCAGTAGGACATGCGGAAACAAGAAATGGAGCAAACAAACAAGGAGCTTTTGAAATCAAGAAAAATAAAAGTCAAGAAGAATATAATTATGAAGTTTATGATAACAGAAACATACTTCAGGATGGGGAACATAAACTTGAAATAAAAAGAGTTGATGGGACAGGTAAAACTTATCAAGGTTTTTGCTTTCAGTTAACGAAAAATTTTCCCACTGCTCAAGGTGTAAGTAAAAAGCTGTATAAAAAATTGAGTAGTAGTGATGAAGAAACACTAAAGCAATATGCCTCTAAGTATACAAGTAATAGGAGAGGAGATACTAGTGGTAATCTTAAAAAGCAAATTGCTAAGGTTCTGACAGAAGGTTACCCAACTAACAAAAGTGATTGGTTAAATGGATTGACTGAAAACGAAAAAATAGAAGTAACCCAGGATGCAATTTGGTATTTTACAGAAACGACAGTTCCGGCTGATAGAAGTTATACGAATCGCAACGTAAATAGTCAAAAAATGAAAGAAGTGTATCAAAAGCTAATTGATACAACAGATATAGATAAATATGAAGATGTACAATTTGATTTATTTGTGCCACAAGATACAAACTTACAGGCAGTAATTAGTGTAGAGCCTGTTATCGAAAGCCTTCCTTGGACATCGTTGAAGCCAATAGCCCAGAAGGATATCACTGCCAAAAAAATCTGGGTAGATGCACCTAAAGAAAAACCAATTATTTATTTTAAGCTATATAGACAGCTGCCTGGAGAAAAGGAAGTAGCAGTGGATGACGCTGAGCTAAAACAGATAAATAGTGAAGGTCAACAAGAAATATCAGTAACTTGGACAAATCAACTTGTTACAGATGAAAAAGGAATGGCTTACATTTATTCTGTAAAAGAAGTAGATAAAAATGGCGAGTTACTTGAGCCAAAAGATTATATCAAGAAGGAAGATGGACTTACAGTTACTAATACTTATGTAAAGCCAACTAGTGGGCACTATGATATAGAAGTGACATTTGGAAATGGACATATTGATATTACAGAAGATACTACACCAGATATTGTTTCAGGTGAAAACCAAATGAAGCAAATAGAGGGAGAAGATAGTAAGCCTATTGATGAAGTAACGGAAAATAATTTAATTGAATTTGGTAAAAACACGATGCCAGGTGAAGAAGATGGCACAAATTCTAATAAGTATGAAGAAGTCGAAGACTCACGCCCAGTTGATACCTTGTCAGGTTTATCAAGTGAGCAAGGTCAGTCCGGTGATATGACAATTGAAGAAGATAGTGCTACCCATATTAAATTCTCAAAACGTGATATTGACGGCAAAGAGTTAGCTGGTGCAACTATGGAGTTGCGTGATTCATCTGGTAAAACTATTAGTACATGGATTTCAGATGGACAAGTGAAAGATTTCTACCTGATGCCAGGAAAATATACATTTGTCGAAACCGCAGCACCAGACGGTTATGAGATAGCAACTGCTATTACCTTTACAGTTAATGAGCAAGGTCAGGTTACTGTAAATGGCAAAGCAACTAAAGGTGACGCTCATATTGTCATGGTTGATGCTTACAAGCCAACTAAGGGTTCAGGTCAGGTTATTGATATTGAAGAAAAGCTT
CCAGACGAGCAGGGCCATTCTGGCTCAACTACTGAAATAGAAGATAGCAAGTCTTCAGACGTTATCATTGGTGGTCAGGGGCAGATTGTCGAGACAACAGAGGATACCCAAACTGGCATGCACGGGGATTCTGGTTGTAAAACGGAAGTCGAAGATACTAAACTAGTACAATCCTTCCACTTTGATAACAAGGAATCAGAAAGTAACTCTGAGATTCCTAAAAAAGATAAGCCAAAGAGTAATACTAGTTTACCAGCAACTGGTGAGAAGCAACATAATATGTTCTTTTGGATGGTTACTTCTTGCTCACTTATTAGTAGTGTTTTTGTAATATCACTAAAAACTAAAAAACGCCTATCATCATGT SEQ ID NO: 103MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEFGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVFVISLKTKKRLSSC

Orf 84 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 181 LPATG (shown in italics in SEQ ID NO: 103, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant Orf 84 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in Orf 84. The pilin motif sequence isunderlined in SEQ ID NO: 103, below. A conserved lysine (K) residue isalso marked in bold, at amino acid residue 270. The pilin sequence, inparticular the conserved lysine residue, is thought to be important forthe formation of oligomeric, pilus-like structures. Preferred fragmentsof Orf 84 include the conserved lysine residue. Preferably, fragmentsinclude the pilin sequence. SEQ ID NO: 103MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVTSVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEPGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFEWMVTSCSLISSVFVISLKTKKRLSSC

An E box containing a conserved glutamic residue has been identified inOrf 84. The E-box motif is underlined in SEQ ID NO: 103, below. Theconserved glutamic acid (E), at amino acid residue 516, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of Orf 84. Preferred fragments of Orf 84 includethe conserved glutamic acid residue. Preferably, fragments include the Ebox motif. SEQ ID NO: 103MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVTSVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEPGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFEWMVTSCSLISSVFVISLKTKKRLSSC

Examples of GAS AI-3 sequences from M18 strain isolate MGAS 8232 are set forth below.

SpyM18_0125 is a negative transcriptional regulator (Nra). An example of SpyM18_0125 is set forth in SEQ ID NO: 72. SEQ ID NO: 72 MPYVKKKKDSFLVETYLEQSIRDKSELVLLLFKSPTIIFSHVAKQTGLTAVQLKYYCKELDDFFGNNLDITIKKGKIICCFVKPVKEFYLHQLYDTSTILKLLVFFIKNGTTSQPLIKFSKKYFLSSSSAYRLRESLIKLLREFGLRVSKNTIVGEEYRIRYLIAMLYSKFGIVIYPLDHLDNQIIYRFLSQSATNLRTSPWLEEPFSFYNMLLALS

SpyM18_0126 is thought to be a collagen binding protein (CBP). An example of SpyM18_0126 is set forth in SEQ ID NO: 73. SEQ ID NO: 73 MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSTETKKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEQSFESSTSGQKLQLSDGTYILTETKSPQGYEIAEPITFKVTAGKVFIKGKDGQFVENQNKEVAEPYSVTAYNDFDDSGFINPKTFTPYGKFYYAKNANGTSQVVYCFNVDLHSPPDSLDKGETIDPDFNEGKEIKYTHILGADLFSYANNPRASTNDELLSQVKKVLEKGYRDDSTTYANLTSVEFRAATQLAIYYFTDSVDLDNLADYHGFGALTTEALNATKEIVAYAEDRANLPNISNLDFYVPNSNKYQSLIGTQYHPESLVDIIRMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETITFENRKDLVPPTGLTTDGAIYLWLLLLVLLGLWVWLIGRKGLKND

SpyM18_0126 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 184 VPPTG (shown in italics in SEQ ID NO: 73, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM18_0126 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM18_(—)0126. The pilin motifsequence is underlined in SEQ ID NO: 73, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 172 and 179.The pilin sequence, in particular the conserved lysine residues, arethought to be important for the formation of oligomeric, pilus-likestructures. Preferred fragments of SpyM18_(—)0126 include at least oneconserved lysine residue. Preferably, fragments include the pilinsequence. SEQ ID NO: 73MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSTETKKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEQSFESSTSGQKLQLSDGTYILTETKSPQGYEIAEPITFKVTAGKVFIKGKDGQFVENQNKEVAEPYSVTAYNDFDDSGFINPKTFTPYGK FYYAKNANGTSQVVYCFNVDLHSPPDSLDKGETIDPDFNEGKEIKYTHILGADLFSYANNPRASTNDELLSQVKKVLEKGYRDDSTTYANLTSVEFRAATQLAIYYFTDSVDLDNLADYHGFGALTTEALNATKEIVAYAEDRANLPNISNLDFYVPNSNKYQSLIGTQYHPESLVDIIRMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETITFENRKDLVPPTGLTTDGAIYLWLLLLVLLGLWVWLIGRKGLKND

Three E boxes containing conserved glutamic residues have beenidentified in SpyM18_(—)0126. The E-box motifs are underlined in SEQ IDNO: 73, below. The conserved glutamic acid (E) residues, at amino acidresidues 112, 257, and 415, are marked in bold. The E box motifs, inparticular the conserved glutamic acid residues, are thought to beimportant for the formation of oligomeric pilus-like structures ofSpyM18_(—)0126. Preferred fragments of SpyM18_(—)0126 include at leastone conserved glutamic acid residue. Preferably, fragments include atleast one E box motif. SEQ ID NO: 73MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQSTETKKTSVIIRKYAEGDYSKLLEGATLKLAQIEGSGFQEQSFESSTSGQKLQLSDGTYILTETKSPQGYEIAEPITFKVTAGKVFIKGKDGQFVENQNKEVAEPYSVTAYNDFDDSGFINPKTFTPYGKFYYAKNANGTSQVVYCFNVDLHSPPDSLDKGETIDPDFNEGKEIKYTHILGADLFSYANNPRASTNDELLSQVKKVLEKGYRDDSTTYANLTSVEFRAATQLAIYYFTDSVDLDNLADYHGFGALTTEALNATKEIVAYAEDRANLPNISNLDFYVPNSNKYQSLIGTQYHPESLVDIIRMEDKQAPIIPITHKLTISKTVTGTIADKKKEFNFEIHLKSSDGQAISGTYPTNSGELTVTDGKATFTLKDGESLIVEGLPSGYSYEITETGASDYEVSVNGKNAPDGKATKASVKEDETITFENRKDLVPPTGLTTDGAIYLWLLLLVLLGLWVWLIGRKGLKND

SpyM18_0127 is a LepA protein. An example of SpyM18_0127 is shown in SEQ ID NO: 74. SEQ ID NO: 74 MTNYLNRLNENPLEKAFIRLVLKISIIGFLGYILFQYIFGVMIINTNVMSPALSAGDGILYYRLTDRYHINDVVVYEVDNTLKVGRIVAQAGDEVSFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKVPTGTYFILNDYREERLDSRYYGALPINQIKGKISTLLRVRGI

SpyM18_0128 is thought to be a fimbrial protein. An example of SpyM18_0128 is shown in SEQ ID NO: 75. SEQ ID NO: 75 MKKNKLLLATAILATALGTASLNQNVKAETAGVIDGSTLVVKKTFPSYTDDKVLMPKADYTFKVEADDNAKGKTKDGLDIKPGVTDGLENTKTIHYGNSDKTTAKEKSVNFDFANVKFPGVGVYRYTVSEVNGNKAGIAYDSQQWTVDVYVVNREDGGFEAKYIVSTEGGQSDKKPVLFKNFFDTTSLKVTKKVTGNTGEHQRSFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKVTEEDVTKDGYKTSATLKDGDVTDGYNLGDSKTTDKSTDEIVVTNKRDTQVPTGVVGTLAPFAVLSIVATGGVIYITKRKKA

SpyM18_0128 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 140 QVPTG (shown in italics in SEQ ID NO: 75, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM18_0128 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM18_(—)0128. The pilin motifsequence is underlined in SEQ ID NO: 75, below. A conserved lysine (K)residue is also marked in bold, at amino acid residue 57. The pilinsequence, in particular the conserved lysine residue, is thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of SpyM18_(—)0128 include the conserved lysineresidue. Preferably, fragments include at least one pilin sequence. SEQID NO: 75 MKKNKLLLATAILATALGTASLNQNVKAETAGVIDGSTLVVKKTFPSYTDDKVLMPKADYTFKVEADDNAKGKTKDGLDIKPGVIDGLENTKTIHYGNSDKTTAKEKSVNFDFANVKFPGVGVYRYTVSEVNGNKAGIAYDSQQWTVDVYVVNREDGGFEAKYIVSTEGGQSDKKPVLFKNFFDTTSLKVTKKVTGNTGEHQRSFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKVTEEDVTKDGYKTSATLKDGDVTDGYNLGDSKTTDKSTDEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

An E box containing a conserved glutamic residue has been identified inSpyM18_(—)0128. The E-box motif is underlined in SEQ ID NO: 75, below.The conserved glutamic acid (E), at amino acid residue 266, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyM18_(—)0128. Preferred fragments ofSpyM18_(—)0128 include the conserved glutamic acid residue. Preferably,fragments include the E box motif. SEQ ID NO: 75MKKNKLLLATAILATALGTASLNQNVKAETAGVIDGSTLVVKKTFPSYTDDKVLMPKADYTFKVEADDNAKGKTKDGLDIKPGVIDGLENTKTIHYGNSDKTTAKEKSVNFDFANVKFPGVGVYRYTVSEVNGNKAGIAYDSQQWTVDVYVVNREDGGFEAKYIVSTEGGQSDKKPVLFKNFFDTTSLKVTKKVTGNTGEHQRSFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKVTEEDVTKDGYKTSATLKDGDVTDGYNLGDSKTTDKSTDEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

SpyM18_0129 is a SrtC2 type sortase. An example of SpyM18_0129 is shown in SEQ ID NO: 76. SEQ ID NO: 76 MISQRMMMTIVQVINKAIDTLILIFCLVVLFLAGFGLWDSYHLYQQADASNFKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHMDYPLVQGKTNLEYINKAVDGSVAMSGSLFLDTRNHNDFTDDYSLIYGHHMAGNAMFGEIPKFLKKDFFNKHNKAIIETKERKKLTVTIFACLKTDAFDQLVFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVIVVGTIQE

SpyM18_0130 is referred to as a hypothetical protein. An example of SpyM18_0130 is shown in SEQ ID NO: 77. SEQ ID NO: 77 MRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTSFSVALESIDAMKTIDEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

SpyM18_0130 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 185 LPLAG (shown in italics in SEQ ID NO: 77, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM18_0130 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM18_(—)0130. The pilin motifsequence is underlined in SEQ ID NO: 77, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 144, 159, and169. The pilin sequence, in particular the conserved lysine residues,are thought to be important for the formation of oligomeric, pilus-likestructures. Preferred fragments of SpyM18_(—)0130 include at least oneconserved lysine residue. Preferably, fragments include the pilinsequence. SEQ ID NO: 77MRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTSFSVALESIDAMKTIDEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPI PPRQPDIPKTPLPLAGEVKSLLGTLSIVLLGLLVLLYVKKLKSRL

An E box containing a conserved glutamic residue has been identified inSpyM18_(—)0130. The E-box motif is underlined in SEQ ID NO: 77, below.The conserved glutamic acid (E), at amino acid residue 134, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyM18_(—)0130. Preferred fragments ofSpyM18_(—)0130 include the conserved glutamic acid residue. Preferably,fragments include the E box motif. SEQ ID NO: 77MRKYWKMLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTSFSVALESIDAMKTIDEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPKTPLPLAGEVKSLLGTLSIVLLGLLVLLYVKKLKSRL

SpyM18_0131 is referred to as a putative multiple sugar metabolism regulator. An example of SpyM18_0131 is set forth in SEQ ID NO: 78. SEQ ID NO: 78 MAIFDLKHVQTLHSLSQLPISVMSQDKALIQVYGNDDYLLCYYQFLKHLAIPQAAQDVIFYEGLFEESFMIFPLCHYIIAIGPFYPYSLNKDYQEQLANNCLKHSSHRSKEELLSYMALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTIHQLLQHSKQMTADPDIIHRLKHISKASSQLPPVLEHLNHIMDLVKLGNPQLLKQEINRIPLSSITSSSISALRAEKNLTVIYLTRLLEFSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDTAKRLYVSESHLRSVFKKYSNVSLQHYTLSTKTKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGISSKDYLAKYRDNI

SpyM18_0132 is an F2-like fibronectin-binding protein. An example of SpyM18_0132 is set forth in SEQ ID NO: 79. SEQ ID NO: 79 MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEPGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVPVISLKTKKRLSSC

SpyM18_0132 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 180 LPATG (shown in italics in SEQ ID NO: 79, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyM18_0132 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyM18_(—)0132. The pilin motifsequence is underlined in SEQ ID NO: 79, below. A conserved lysine (K)residue is also marked in bold, at amino acid residue 270. The pilinsequence, in particular the conserved lysine residue, is thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of SpyM18_(—)0132 include the conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO: 79MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEPGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVPVISLKTKKRLSSC

An E box containing a conserved glutamic residue has been identified inSpyM18_(—)0132. The E-box motif is underlined in SEQ ID NO: 79, below.The conserved glutamic acid (E), at amino acid residue 516, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyM18_(—)0132. Preferred fragments ofSpyM18_(—)0132 include the conserved glutamic acid residue. Preferably,fragments include the E box motif. SEQ ID NO: 79MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGAFEIKKNKSQEEYNYEVYDNRNILQDGEHKLEIKRVDGTGKTYQGFCFQLTKNFPTAQGVSKKLYKKLSSSDEETLKQYASKYTSNRRGDTSGNLKKQIAKVLTEGYPTNKSDWLNGLTENEKIEVTQDAIWYFTETTVPADRSYTNRNVNSQKMKEVYQKLIDTTDIDKYEDVQFDLFVPQDTNLQAVISVEPVIESLPWTSLKPIAQKDITAKKIWVDAPKEKPIIYFKLYRQLPGEKEVAVDDAELKQINSEGQQEISVTWTNQLVTDEKGMAYIYSVKEVDKNGELLEPKDYIKKEDGLTVTNTYVKPTSGHYDIEVTFGNGHIDITEDTTPDIVSGENQMKQIEGEDSKPIDEVTENNLIEPGKNTMPGEEDGTNSNKYEEVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDVIIGGQGQIVETTEDTQTGMHGDSGCKTEVEDTKLVQSFHFDNKESESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVPVISLKTKKRLSSC

Examples of GAS AI-3 sequences from M49 strain isolate 591 are set forth below.

SpyoM01000156 is a negative transcriptional regulator (Nra). An exampleof SpyoM01000156 is set forth in SEQ ID NO: 243. SEQ ID NO: 243MPYVKKKKDSFLVETYLEQSIRDKSELVLLLFKSPTIIFSHVAKQTGLTAVQLKYYCKELDDEFGNNLDITIKKGKIIGCFVKPVKEFYLHQLYDTSTILKLLVFFIKNGTSSQPLIKFSKKYFLSSSSAYRLRESLIKLLREFGLRVSKNTIVGEEYRIRYLIAMLYSKFGIVIYPLDHLDNQIIYRFLSQSATNLRTSPWLEEPFSFYNMLLALSWKRHQFAVSIPQTRIFRQLKKLFIYDCLTRSSRQVIENAFSLTFSQGDLDYLELIYITTNNSEASLQWTPQHIETCCHIFEKNDTERLLLEPILKRLPQLNHSKQDLIKALMYFSKSFLFNLQHEVIEIPSFSLPTYTGNSNLYKALKNIVNQWLAQLPGKRHLNEKHLQLFCSHIEQILKNKQPALTVVLISSNFINAKLLTDTIPRYFSDKGIHFYSFYLLRDDTYQIPSLKPDLVITHSRLTPFVKNDLVKGVTVAEFSFDNPDYSIASIQNLIYQLKDK KYQDFLNEQLQ

SpyoM01000155 is thought to be a collagen binding protein (CPA). Anexample of SpyoM01000155 is set forth in SEQ ID NO: 244. SEQ ID NO: 244MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGESIRAFGAEEQSVPNRQSSIQDYPWYGYDSYPKGYPDYSPLKTYHNLKVNLEGSKDYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNNRNGIMKGIDPLNAILVTQNAIWYYTDSAQINPDESFKTEARSNGINDQQLGLMRKALKELIDPNLGSKYSNKTPSGYRLNVFESHDKTFQNLLSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLSQIEGSGFQEKDFQSNSLGETVELPNGTYTLTETSSPDGYKIAEPIKFRVENKKVETVQKDGSQVENPNKEVAEPYSVEAYNDFMDEEVLSGFTPYGKFYYAKNKDKSSQVVYCPNADLHSPPDSYDSGETTNPDTSTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVIEKGYKKKGDSYNGLTETQFRAATQLAIYYFTDSADLKTLKTYNNGKGYHGFESMDEKTLAVTKELITYAQNGSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVTRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSVGKDITEDKKVTFENRKDLVPPTGLTTDGAIYLWLLLLVPLGLLVWLFGRKGLKND

SpyoM01000155 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 184 VPPTG (shown in italics in SEQ ID NO: 244, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyoM01000155 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
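By way of illustration only, locating a cell wall anchor motif and separating the portion of the protein upstream of it might be sketched as follows. The names `ANCHOR_MOTIFS` and `split_at_anchor`, and the choice to cut immediately before the most C-terminal motif, are assumptions made for this example and are not part of the disclosed methods. The test sequence is the C-terminal excerpt of SEQ ID NO: 244 set forth above.

```python
# Illustrative sketch only. Function and variable names are hypothetical,
# and cutting immediately before the motif is an assumption of this example.

# Cell wall anchor motifs named in this section.
ANCHOR_MOTIFS = ("VPPTG", "QVPTG", "LPLAG", "LPATG")


def split_at_anchor(seq, motifs=ANCHOR_MOTIFS):
    """Split a protein sequence at its most C-terminal anchor motif.

    Returns (upstream, motif, downstream), or None if no motif is present.
    Whitespace in the input is ignored.
    """
    seq = "".join(seq.split()).upper()
    best = None  # (index, motif) of the most C-terminal match so far
    for motif in motifs:
        idx = seq.rfind(motif)
        if idx != -1 and (best is None or idx > best[0]):
            best = (idx, motif)
    if best is None:
        return None
    idx, motif = best
    return seq[:idx], motif, seq[idx + len(motif):]


# C-terminal excerpt of SEQ ID NO: 244 (SpyoM01000155), as set forth above.
C_TERMINAL_EXCERPT = "KVTFENRKDLVPPTGLTTDGAIYLWLLLLVPLGLLVWLFGRKGLKND"

result = split_at_anchor(C_TERMINAL_EXCERPT)
if result is not None:
    upstream, motif, downstream = result
    print("upstream of anchor:", upstream)
    print("anchor motif:", motif)
    print("downstream of anchor:", downstream)
```

In this sketch the upstream portion corresponds to a construct lacking the anchor motif, while the full input corresponds to a construct retaining it. Which is preferable depends on the recombinant host cell system, as described above.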

Two pilin motifs, discussed above, containing conserved lysine (K)residues have also been identified in SpyoM01000155. The pilin motifsequence is underlined in SEQ ID NO: 244, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 71 and 261. Thepilin sequences, in particular the conserved lysine residues, arethought to be important for the formation of oligomeric, pilus-likestructures. Preferred fragments of SpyoM01000155 include at least oneconserved lysine residue. Preferably, fragments include at least onepilin sequence. SEQ ID NO: 244MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGFSIRAFGAEEQS VPNRQSSIQDYPWYGYDSYPKGYPDYSPLKTYHNLKVNLEGSKDYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNNRNGIMKGIDPLNAILVTQNAIWYYTDSAQINPDESFKTEARSNGINDQQLGLMRKALKELIDPNLGSKYSNKTPSGYRLNVFESHDKTFQNLL SAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLSQIEGSGFQEKDFQSNSLGETVELPNGTYTLTETSSPDGYKIAEPIKFRVENKKVFIVQKDGSQVENPNKEVAEPYSVEAYNDFMDEEVLSGFTPYGKFYYAKNKDKSSQVVYCFNADLHSPPDSYDSGETINPDTSTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVIEKGYKKKGDSYNGLTETQFRAATQLAIYYFTDSADLKTLKTYNNGKGYHGFESMDEKTLAVTKELITYAQNGSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSVGKDITEDKKVTFENRKDLVPPTGLTTDGAIYLWLLLLVPLGLLVWLFGRKGLKND

Two E boxes containing conserved glutamic residues have been identifiedin SpyoM1000155. The E-box motifs are underlined in SEQ ID NO: 244,below. The conserved glutamic acid (E) residues, at amino acid residues329 and 668, are marked in bold. The E box motifs, in particular theconserved glutamic acid residues, are thought to be important for theformation of oligomeric pilus-like structures of SpyoM01000155.Preferred fragments of SpyoM01000155 include at least one conservedglutamic acid residue. Preferably, fragments include at least one E boxmotif. SEQ ID NO: 244 MQKRDKTNYGSANNKRRQTTIGLLKVFLTFVALIGIVGESIRAFGAEEQSVPNRQSSIQDYPWYGYDSYPKGYPDYSPLKTYHNLKVNLEGSKDYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENFIKLADKPRIEDGQLQQNILRILYNGYPNNRNGIMKGIDPLNAILVTQNAIWYYTDSAQINPDESFKTEARSNGINDQQLGLMRKALKELIDPNLGSKYSNKTPSGYRLNVFESHDKTFQNLLSAEYVPDTPPKPGEEPPAKTEKTSVIIRKYAEGDYSKLLEGATLKLSQIEGSGFQEKDFQSNSLGETVELPNGTYTLTETSSPDGYKIAEPIKFRVENKKVETVQKDGSQVENPNKEVAEPYSVEAYNDFMDEEVLSGFTPYGKFYYAKNKDKSSQVVYCPNADLHSPPDSYDSGETTNPDTSTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVIEKGYKKKGDSYNGLTETQFRAATQLAIYYFTDSADLKTLKTYNNGKGYHGFESMDEKTLAVTKELITYAQNGSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVTRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSVGKDITEDKKVTFENRKDLVPPTGLTTDGAIYLWLLLLVPLGLLVWLFGRKGLKND

SpyoM01000154 is a LepA protein. An example of SpyoM01000154 is shown in SEQ ID NO: 245. SEQ ID NO: 245 MTNYLNRLNENSLFKAFIRLVLKISTIGFLGYILFQYVFGVMIINTNDMSPALSAGDGVLYYRLADRSHINDVVVYEVDNTLKVGRIAAQAGDEVNFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKVPTGTYFILNDYREERLDSRYYGALPINQIKGKISTLLRVRGI

SpyoM01000153 is thought to be a fimbrial protein. An example ofSpyoM01000153 is shown in SEQ ID NO: 246. SEQ ID NO: 246MKKNKLLLATAILATALGMASMSQNIKAETAGVIDGSTLVVKKTFPSYTDDNVLMPKADYSFKVEADDNAKGKTKDGLDIKPGVIDGLENTKTIRYSNSDKITAKEKSVNFEFANVKFPGVGVYRYTVAEVNGNKAGITYDSQQWTVDVYVVNKEGGGFEVKYIVSTEVGQSEKKPVLFKNSFDTTSLKIEKQVTGNTGEHQRLFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKLTEEDVTKDGYKTSATLKDGEQSSTYELGKDHKTDKSADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

SpyoM01000153 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 140 QVPTG (shown in italics in SEQ ID NO: 246, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyoM01000153 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in SpyoM01000153. The pilin motifsequence is underlined in SEQ ID NO: 246, below. A conserved lysine (K)residue is also marked in bold, at amino acid residue 57. The pilinsequence, in particular the conserved lysine residue, is thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of SpyoM01000153 include the conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO:246 MKKNKLLLATAILATALGMASMSQNIKAETAGVIDGSTLVVKKTFPSYTD DNVLMPKADYSFKVEADDNAKGKTKDGLDIKPGVIDGLENTKTIRYSNSDKITAKEKSVNFEFANVKFPGVGVYRYTVAEVNGNKAGITYDSQQWTVDVYVVNKEGGGEEVKYIVSTEVGQSEKKPVLFKNSFDTTSLKIEKQVTGNTGEHQRLFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKLTEEDVTKDGYKTSATLKDGEQSSTYELGKDHKTDKSADEIVVTNKRDTQVPTGVVGTLAPEAVLSIVAIGGVIYITKRKKA

An E box containing a conserved glutamic residue has been identified inSpyoM01000153. The E-box motif is underlined in SEQ ID NO: 246, below.The conserved glutamic acid (E), at amino acid residue 265, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of SpyoM01000153. Preferred fragments ofSpyoM01000153 include the conserved glutamic acid residue. Preferably,fragments include the E box motif. SEQ ID NO: 246MKKNKLLLATAILATALGMASMSQNIKAETAGVIDGSTLVVKKTFPSYTDDNVLMPKADYSFKVEADDNAKGKTKDGLDIKPGVIDGLENTKTIRYSNSDKITAKEKSVNFEFANVKFPGVGVYRYTVAEVNGNKAGITYDSQQWTVDVYVVNKEGGGFEVKYIVSTEVGQSEKKPVLFKNSFDTTSLKIEKQVTGNTGEHQRLFSFTLLLTPNECFEKGQVVNILQGGETKKVVIGEEYSFTLKDKESVTLSQLPVGIEYKLTEEDVTKDGYKTSATLKDGEQSSTYELGKDHKTDKSADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

SpyoM01000152 is a SrtC2 type sortase. An example of SpyoM01000152 is shown in SEQ ID NO: 247. SEQ ID NO: 247 MMMTIVQVINKAIDTLILIFCLVVLFLAGFGLWDSYHLYQQADASNFKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKAVDGSVAMSGSLFLDTRNHNDFTDDYSLIYGHHMAGNAMFGEIPKFLKKNFFNKHNKAIIETKERKKLTVTIFACLKTDAFDQLVFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAPSTCENFSTDNRVTVVGTIQE

SpyoM01000151 is referred to as a hypothetical protein. An example of SpyoM01000151 is shown in SEQ ID NO: 248. SEQ ID NO: 248 MLFSVVMMLTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

SpyoM01000151 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 185 LPLAG (shown in italics in SEQ ID NO: 248, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyoM01000151 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K) residue has also been identified in SpyoM01000151. The pilin motif sequence is underlined in SEQ ID NO: 248, below. The conserved lysine (K) residue is also marked in bold, at amino acid residue 138. The pilin sequence, in particular the conserved lysine residue, is thought to be important for the formation of oligomeric, pilus-like structures. Preferred fragments of SpyoM01000151 include the conserved lysine residue. Preferably, fragments include the pilin sequence. SEQ ID NO: 248 MLFSVVMMLTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL

Two E boxes containing conserved glutamic acid residues have been identified in SpyoM01000151. The E-box motifs are underlined in SEQ ID NO: 248, below. The conserved glutamic acid (E) residues, at amino acid residues 58 and 128, are marked in bold. The E box motifs, in particular the conserved glutamic acid residues, are thought to be important for the formation of oligomeric pilus-like structures of SpyoM01000151. Preferred fragments of SpyoM01000151 include at least one conserved glutamic acid residue. Preferably, fragments include at least one E box motif. SEQ ID NO: 248 MLFSVVMMLTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSRL
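As a non-limiting illustration, checking the residue at a stated position and extracting a fragment that includes it might be sketched as follows. The helper names `residue_at` and `fragment_around`, and the flank length used, are hypothetical choices made for this example. Positions are treated as 1-based, consistent with the amino acid residue numbering used above, and the test sequence is the first 67 residues of SEQ ID NO: 248.

```python
# Illustrative sketch only. Helper names and the flank length are
# hypothetical; positions are 1-based, as in the residue numbering above.


def residue_at(seq, position):
    """Return the residue at a 1-based position in seq."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    return seq[position - 1]


def fragment_around(seq, position, flank):
    """Return a fragment containing the residue at `position`, extended by up
    to `flank` residues on each side and clipped to the ends of the sequence."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    start = max(1, position - flank)          # 1-based, inclusive
    end = min(len(seq), position + flank)     # 1-based, inclusive
    return seq[start - 1:end]


# First 67 residues of SEQ ID NO: 248 (SpyoM01000151), as set forth above.
N_TERMINAL_EXCERPT = (
    "MLFSVVMMLTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALES"
    "IDAMKTIEEITIAGSGK"
)

# Residue 58 is one of the conserved glutamic acid (E) residues noted above.
print(residue_at(N_TERMINAL_EXCERPT, 58))
print(fragment_around(N_TERMINAL_EXCERPT, 58, 5))
```

The same helpers could be applied to the conserved lysine residue at position 138 by supplying the full-length sequence.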

SpyoM01000150 is referred to as a putative MsmRL. An example ofSpyoM01000150 is set forth in SEQ ID NO: 249. SEQ ID NO: 249MVIFDLKHVQTLHSLSQLPISVMSQDKALIQVYGNDDYLLCYYQFLKHLAIPQAAQDVIFYEGLFEESFMIFPLCHYIIAIGPFYPYSLNKDYQEQLANNFLKHSSHRSKEELLSYMALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTTHQLLQHSKQMTADPDIIHRLKHISKASSQLPPVLEHLNHIMDLVKLGNPQLLKQEINRIPLSSITSSSISALRAEKNLTVIYLTRLLEFSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDIAKRLYVSESHLRSVFKKYSNVSLQHYILSTKIKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGISSKDYLAKYRDN I

SpyoM01000149 is a F2 like fibronectin-binding protein. An example ofSpyoM01000149 is set forth in SEQ ID NO: 250. SEQ ID NO: 250MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGYFEIKKVDQNNKPLSGATFSLTPKDGKGKPVQTFTSSEEGIIDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSSLQLENPKMSVVSKYGEQEKTSNSADFYRNHAAYFKMSFELKQKDKSETINPGDTFVLQLDRRLNPKGISQDIPKIIYDSENSPLAIGKYDAKTHQLTYTETNYIAGLDKVQLSAELSLFLENKEVLENTNISDFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNIPYAVLNLWGFAKRTAQGENDNSSVSSAQLTGYDIYEVPHNYRLPTSYGVDISRLNLRKDLEAKLPQGSTQGANKRLRIDFGENLQGKAFVVKVTGKADQSGKELIVQSHLSSPNNWGSYKTLRPNSHVSETNEIALSPSKGSGSGTSEETKPAITVANLKRVAQLRFKKVSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEIHPKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEQMVKWEKPHSFVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVKVPDGYKVSYLGNDIFNTRETEFVFEQNNENLEFGNAEIKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDIIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPNLEIEETLPLESGASGGTTTVEDSRSVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKPSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMHGDSGCKTEVEDTKLVQFFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVIS LKSKKRLLSC

SpyoM01000149 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 180 LPATG (shown in italics in SEQ ID NO: 250, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant SpyoM01000149 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, discussed above, containing conserved lysine (K)residues have also been identified in SpyoM01000149. The pilin motifsequences are underlined in SEQ ID NO: 250, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 157 and 163,and 216 and 224. The pilin sequences, in particular the conserved lysineresidues, are thought to be important for the formation of oligomeric,pilus-like structures. Preferred fragments of SpyoM01000149 include atleast one conserved lysine residue. Preferably, fragments include atleast one pilin sequence. SEQ ID NO: 250MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGYFEIKKVDQNNKPLSGATFSLTPKDGKGKPVQTFTSSEEGIIDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSS LQLENPKMSVVSKYGEQEKTSNSADFYRNHAAYFKMSFELKQKDKSETIN PGDTFVLQLDRRLNP KGISQDIPKIIYDSENSPLAIGKYDAKTHQLTYTETNYIAGLDKVQLSAELSLFLENKEVLENTNISDFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNIPYAVLNLWGFAKRTAQGENDNSSVSSAQLTGYDIYEVPHNYRLPTSYGVDISRLNLRKDLEAKLPQGSTQGANKRLRIDFGENLQGKAFVVKVTGKADQSGKELIVQSHLSSPNNWGSYKTLRPNSHVSETNEIALSPSKGSGSGTSEETKPAITVANLKRVAQLRFKKVSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEIHPKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEQMVKWEKPHSFVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVKVPDGYKVSYLGNDIFNTRETEFVFEQNNENLEFGNAEIKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDIIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPNLEIEETLPLESGASGGTTTVEDSRSVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKPSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMHGDSGCKTEVEDTKLVQFFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVIS LKSKKRLLSC

Two E boxes containing conserved glutamic residues have been identifiedin SpyoM01000149. The E-box motifs are underlined in SEQ ID NO: 250,below. The conserved glutamic acid (E) residues, at amino acid residues329 and 668, are marked in bold. The E box motifs, in particular theconserved glutamic acid residues, are thought to be important for theformation of oligomeric pilus-like structures of SpyoM01000149.Preferred fragments of SpyoM01000149 include at least one conservedglutamic acid residue. Preferably, fragments include at least one E boxmotif. SEQ ID NO: 250 MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGYFEIKKVDQNNKPLSGATFSLTPKDGKGKPVQTFTSSEEGIIDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSSLQLENPKMSVVSKYGEQEKTSNSADFYRNHAAYFKMSFELKQKDKSETINPGDTFVLQLDRRLNPKGISQDIPKIIYDSENSPLAIGKYDAKTHQLTYTETNYIAGLDKVQLSAELSLFLENKEVLENTNISDFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNIPYAVLNLWGFAKRTAQGENDNSSVSSAQLTGYDIYEVPHNYRLPTSYGVDISRLNLRKDLEAKLPQGSTQGANKRLRIDFGENLQGKAFVVKVTGKADQSGKELIVQSHLSSPNNWGSYKTLRPNSHVSETNEIALSPSKGSGSGTSEETKPAITVANLKRVAQLRFKKVSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEIHPKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEQMVKWEKPHSFVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVKVPDGYKVSYLGNDIFNTRETEFVFEQNNENLEFGNAEIKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDIIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPNLEIEETLPLESGASGGTTTVEDSRSVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDAHIVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKPSDVIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMHGDSGCKTEVEDTKLVQFFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNKFFWMVTSCSLISSVFVIS LKSKKRLLSC

As discussed above, applicants have also determined the nucleotide and encoded amino acid sequence of fimbrial structural subunits in several other GAS AI-3 strains of bacteria. Examples of sequences of these fimbrial structural subunits are set forth below.

M3 strain isolate ISS 3040 is a GAS AI-3 strain of bacteria.ISS3040_fimbrial is thought to be a fimbrial structural subunit of M3strain isolate ISS 3040. An example of a nucleotide sequence encodingthe ISS3040_fimbrial protein (SEQ ID NO: 263) and an ISS3040_fimbrialprotein amino acid sequence (SEQ ID NO: 264) are set forth below. SEQ IDNO: 263 gagacggcaggagtgtccgaaaatgcaaaattaatagtaaaaaagacatttgactcttatacagacaatgaagttttaatgccaaaagctgattatacttttaaagtagaggcagatagtacagctagtggcaaaacgaaagacggtttagagattaagccaggtattgttaatggtttaacagaacagattaccagctatactaatactgataaaccagatagtaaagttaaaagtacagagtttgatttttcaaaagtagtattccctggtattggtgtttaccgctatactgtttcagaaaaacaaggtgatgttgaaggaattacctacgatactaagaagtggacagtagatgtttatgttggaaacaaagaaggtggtggttttgaacctaagtttattgtatctaaggaacaaggaacagacgtcaaaaaaccagttaattttaacaactcgtttgcaactacttcgttaaaagttaagaagaatgtatcggggaatactggagaattgcaaaaagaatttgactttacattgacgcttaatgaaagcacgaattttaaaaaagatcaaattgtttctttacaaaaaggaaacgagaaatttgaagttaagattggtactccctacaagtttaaactcaaaaatggggaatctattcaactagacaagttaccagttggtattacttataaagtcaatgaaatggaagctaataaagatgggtataaaacaacagcatccttgaaagagggagatggtcaatctaaaatgtatcaattggatatggaacaaaaaacagacgaatctgctgacgaaatcgttgtcacaaataagcgtgacactcaagttccaactggtgttgtaggcacccttgctccatttgcagttcttagc SEQ ID NO: 264ETAGVSENAKLIVKKTFDSYTDNEVLMPKADYTFKVEADSTASGKTKDGLEIKPGIVNGLTEQIISYTNTDKPDSKVKSTEFDFSKVVFPGIGVYRTVSEKQGDVEGITYDTKKWTVDVYVGNKEGGGFEPKFIVSKEQGTDVKKPVNFNNSFATTSLKVKKNVSGNTGELQKEFDFTLTLNESTNFKKDQIVSLQKGNEKFEVKIGTPYKFKLKNGESIQLDKLPVGITYKVNEMEANKDGYKTTASLKEGDGQSKMYQLDMEQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLS

M44 strain isolate ISS 3776 is a GAS AI-3 strain of bacteria.ISS3776_fimbrial is thought to be a fimbrial structural subunit of M44isolate ISS 3776. An example of a nucleotide sequence encoding theISS3776_fimbrial protein (SEQ ID NO: 253) and an ISS3776_fimbrialprotein amino acid sequence (SEQ ID NO: 254) are set forth below. SEQ IDNO: 253 ttggagagagaaaaaatgaaaaaaaacaaattattacttgctactgcaatcttagcaactgctttaggaacagcttctttaaatcaaaacgtaaaagctgagacggcaggggttgtaacaggaaaatcactacaagttacaaagacaatgacttatgatgatgaagaggtgttaatgcccgaaaccgcctttacttttactatagagcctgatatgactgcaagtggaaaagaaggcagcctagatattaaaaatggaattgtagaaggcttagacaaacaagtaacagtaaaatataagaatacagataaaacatctcaaaaaactaaaatagcacaatttgatttttctaaggttaaatttccagctataggtgtttaccgctatatggtttcagagaaaaacgataaaaaagacggaattacgtacgatgataaaaagtggactgtagatgtttatgttgggaataaggccaataacgaagaaggtttcgaagttctatatattgtatcaaaagaaggtacttctagtactaaaaaaccaattgaatttacaaactctattaaaactacttccttaaaaattgaaaaacaaataactggcaatgcaggagatcgtaaaaaatcattcaacttcacattaacattacaaccaagtgaatattataaaactggatcagttgtgaaaatcgaacaggatggaagtaaaaaagatgtgacgataggaacgccttacaaatttactttgggacacggtaagagtgtcatgttatcgaaattaccaattggtatcaattactatcttagtgaagacgaagcgaataaagacggctacactacaacggcaacattaaaagaacaaggcaaagaaaagagttccgatttcactttgagtactcaaaaccagaaaacagacgaatctgctgacgaaatcgttgtcacaaataagcgtgacactcaagttccaactggtgttgtagggacccttgctccatttgcagttcttagcattgtggctattggtggagttatctatattacaaaacgtaaa aaagcttaa SEQ ID NO:254 MEREKMKKNKLLLATAILATALGTASLNQNVKAETAGVVTGKSLQVTKTMTYDDEEVLMPETAFTFTIEPDMTASGKEGSLDIKNGIVEGLDKQVTVKYKNTDKTSQKTKIAQFDFSKVKFPAIGVYRYMVSEKNDKKDGTTYDDKKWTVDVYVGNKANNEEGFEVLYIVSKEGTSSTKKPIEFTNSIKTTSLKIEKQITGNAGDRKKSFNFTLTLQPSEYYKTGSVVKIEQDGSKKDVTIGTPYKFTLGHGKSVMLSKLPIGINYYLSEDEANKDGYTTTATLKEQGKEKSSDFTLSTQNQKTDESADEIVVTNKRDTQVPTGVVGTLAPFAVLSIVAIGGVIYTTKRK KA

M77 strain isolate ISS4959 is a GAS AI-3 strain of bacteria.ISS4959_fimbrial is thought to be a fimbrial structural subunit of M77strain ISS 4959. An example of a nucleotide sequence encoding theISS4959_fimbrial protein (SEQ ID NO: 271) and an ISS4959_fimbrialprotein amino acid sequence (SEQ ID NO: 272) are set forth below. SEQ IDNO: 271 gtaacagtaaaatataagaatacagataaaacatctcaaaaaactaaaatagcacaatttgatttttctaaggttaaatttccagctataggtgtttaccgctatatggtttcagagaaaaacgataaaaaagacggaattacgtacgatgataaaaagtggacngtagatgtttatgttgggaataaggccaataacgaagaaggtttcgaagttctatatattgtatcaaaagaaggtacttctagtnctaaaaaaccaattgaatttacaaactctattaaaactacttccttaaaaattgaaaaacaaataactggcaatgcaggagatcgtaaaaaatcattcaacttcacattnacattacanccaagtgaatattataaaactggatcagttgtgaaaatcgaacaggatggaagtaaaaaagatgtgacgataggaacgccttacaaatttactttgggacacggtaagagtgtcatgttatcgaaattnccaattggtatcaattactatcttagtgaagacgaagcgaataaagacggntacactacancggcaacattaaaagaacaaggcaaagaaaagagttccgatttcactttgagtactcaaaaccagaaaacagacgaatctgctg SEQ ID NO: 272VTVKYKNTDKTSQKTKIAQFDFSKVKFPAIGVYRYMVSEKNDKKDGITYDDKKWTVDVYVGNKANNEEGREVLYIVSKEGTSSXKKPIEFTNSIKTTSLKIEKQITGNAGDRKKSFNFTXTLXPSEYYKTGSVVKIEQDGSKKDVTIGTPYKFTLGHGKSVNLSKXPIGINYYLSEDEANKDGYTTXATLKEQGKEKSSD FTLSTQNQKTDESA
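For illustration only, the correspondence between a nucleotide sequence and its encoded amino acid sequence, such as SEQ ID NO: 271 and SEQ ID NO: 272 above, might be checked with a sketch like the following. It assumes the standard genetic code and that the reading frame starts at the first nucleotide. The function name `translate` is hypothetical. Any codon containing an ambiguous base such as `n` is reported as `X`, which is a simplification: a codon such as `acn` encodes threonine regardless of the third base, and the sketch does not resolve such cases.

```python
# Illustrative sketch only. Assumes the standard genetic code and a reading
# frame that starts at the first nucleotide; `translate` is a hypothetical name.
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a DNA sequence codon by codon.

    Whitespace is ignored, stop codons are reported as '*', and any codon
    not in the table (for example one containing 'n') is reported as 'X'.
    A trailing partial codon is dropped.
    """
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First six codons of SEQ ID NO: 271, as set forth above.
print(translate("gtaacagtaaaatataag"))
# In-frame excerpt of SEQ ID NO: 271 containing an ambiguous base.
print(translate("tctagtnctaaa"))
```

The first call reproduces the first six residues of SEQ ID NO: 272, and the second shows how an ambiguous base gives rise to an `X` in the encoded sequence.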

Examples of GAS AI-4 sequences from M12 strain isolate A735 are set forth below.

19224133 is thought to be a RofA regulatory protein. An example of anucleotide sequence encoding the RofA regulatory protein (SEQ ID NO:104) and a RofA regulatory protein amino acid sequence (SEQ ID NO: 105)are set forth below. SEQ ID NO: 104ATGACCATCCAAAAAAGGATGATATCTTGCCAATTTACACATCCTTCTAAAGAAACTTATCTTTACCAACTCTATGCATCATCTAATGTCTTACAATTACTAGCGTTTTTAATAAAAAATGGTTCCCACTCTCGTCCCCTTACGGATTTTGCAAGAAGTCATTTTTTATCAAACTCCTCAGCTTATCGGATGCGCGAAGCATTGATTCCTTTATTAAGAAACTTTGAATTAAAACTCTCTAAGAACAAGATTGTCGGTGAGGAATATCGTATCCGTTACCTCATCGCTCTGCTATATAGTAAGTTTGGCATTAAAGTTTATGACTTGACGCAGCAAGACAAAAACATTATTCATAGCTTTTTATCCCATAGTTCCACCCACCTTAAAACTTCTCCTTGGTTATCGGAATCGTTTTCTTTCTATGACATTTTATTAGCTTTATCGTGGAAGCGGCATCAATTTTCGGTAACTATTCCCCAAACCAGAATTTTTCAACAATTAAAAAAACTTTTTGTCTACGATTCTTTGAAAAAAAGTAGCCGTGATATTATCGAAACTTACTGCCAACTAAACTTTTCAGCAGGAGATTTGGACTACCTCTATTTAATTTATATCACCGCTAATAATTCTTTTGCGAGCTTACAATGGACACCTGAGCATATCAGACAATGTTGTCAACTTTTTGAAGAAAATGATACTTTTCGCCTGCTTTTAAATCCTATCATCACTCTTTTACCTAACCTAAAAGAGCAAAAGGCTAGTTTAGTAAAAGCTCTTATGTTTTTTTCAAAATCATTCTTGTTTAATCTGCAACATTTTATTCCTACAGATTCTTTCCCAAGGTATTTCTCGGATAAAAGCATTGATTTTCATTCCTATTATCTATTGCAAGATAATGTTTATCAAATTCCTGATTTAAAGCCAGATTTGGTCATCACTCACAGTCAACTGATTCCTTTTGTTCACCATGAACTTACAAAAGGAATTGCTGTTGCTGAAATATCTTTTGATGAATCGATTCTGTCTATCCAAGAATTGATGTATCAAGTTAAAGAGGAAAAATTCCAAGCTGATTTAACCAAACAATTAACATAA SEQ ID NO: 105MTIQKRMISCQFTHPSKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDFARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEYRIRYLIALLYSKFGIKVYDLTQQDKNIIHSFLSHSSTHLKTSPWLSESFSFYDILLALSWKRHQFSVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANNSFASLQWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSKSFLFNLQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILRNIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITHSQLIPFVHHELTKGTAVAEISFDESTLSIQELMYQVKEEKFQADLTKQLT

19224134 is thought to be a protein F fibronectin binding protein. Anexample of a nucleotide sequence encoding the protein F fibronectinbinding protein (SEQ ID NO: 106) and a protein F fibronectin bindingprotein amino acid sequence (SEQ ID NO: 107) are set forth below. SEQ IDNO: 106 ATGGTAAGCTCATATATGTTTGCGAGAGGAGAGAAAATGAATAACAAAATGTTTTTGAACAAAGAAGCCGGTTTTTTGGTACACACAAAAAGAAAAAGGCGATTTGCTGTCACTTTAGTGGGAGTCTTTTTTCTGCTTTTGGCATGTGCGGGTGCTATCGGTTTTGGTCAAGTAGCCTATGCTGCGGATGAGAAGACTGTGCCGAATTTTAAAAGCCCAGATCCAGATTATCCCTGGTATGGTTATGATTCGTATAGAGGAATATTTGCAAGATATCACAATTTAAAAGTAAATCTAAAAGGAAGTAAGGAGTATCAAGCGTATTGTTTTAACCTAACAAAATACTTTCCTCGCCCCACTTATAGTACTACAAATAATTTTTACAAGAAAATTGATGGGAGTGGATCAGCGTTCAAATCTTATGCAGCGAATCCTAGGGTTTTAGATGAGAATTTAGATAAATTAGAAAAAAATATACTGAATGTAATTTATAATGGATATAAAAGTAATGCAAATGGTTTTATGAATGGTATAGAAGATCTTAATGCTATACTAGTAACTCAAAACGCTATTTGGTACTATTCAGATAGTGCTCCATTAAATGATGTTAATAAAATGTGGGAAAGAGAGGTTCGGAATGGGGAGATTAGTGAGTCACAAGTTACTTTAATGCGTGAGGCATTGAAAAAACTAATTGATCCCAATTTAGAAGCTACTGCAGCTAATAAAATCCCATCAGGATATCGTTTAAATATCTTTAAGTCTGAAAATGAAGATTACCAAAATCTTTTAAGTGCTGAATATGTACCTGATGATCCCCCTAAACCTGGTGATACGTCAGAACATAATCCTAAAACTCCCGAGTTGGATGGCACTCCAATTCCCGAGGACCCAAAACGTCCAGATGAGAGTTCAGAACCTGCGCTTCCCCCATTAATGCCAGAGCTAGATGGTGAAGAAGTCCCAGAAGTTCCAAGCGAGAGCTTAGAACCTGCGCTTCCCCCATTGATGCCAGAGCTAGATGGTGAAGAAGTCCCAGAAGTTCCAAGCGAGAGCTTAGAACCTGCGCTTCCCCCATTGATGCCAGAGCTAGATGGTGAAGAAGTCCCAGAAGTTCCAAGCGAGAGCTTAGAACCTGCGCTTCCCCCATTAATGCCAGAGCTAGATGGTGAAGAAGTCCCAGAAGTTCCAAGCGAGAGCTTAGAACCTGCGCTTCCCCCATTGATGCCAGAGTTAGATGGTGAAGAAGTCCCTGAAAAACCTAGTGTTGACTTACCTATTGAAGTTCCTCGTTATGAGTTTAACAATAAAGACCAGTCACCTCTAGCGGGTGAGTCTGGTGAGACGGAGTATATTACCGAAGTCTATGGAAATCAACAGAACCCTGTTGATATTGATAAAAAACTTCCGAATGAAACAGGTTTTTCAGGAAATATGGTTGAGACAGAAGATACGAAAGAGCCAGAAGTGTTGATGGGAGGTCAAAGTGAGTCTGTTGAATTTACTAAAGACACTCAAACAGGCATGAGTGGTCAAACAACTCCTCAGGTTGAGACAGAAGATACGAAAGAGCCAGAAGTGTTGATGGGAGGTCAAAGTGAGTCTGTTGAATTTACTAAAGACACTCAAACAGGCATGAGTGGTCAAACAACTCCTCAGGTTGAGACAGAAGATACGAAAGAGCCAGGAGTGTTGATGGGAGGCCAAAGTGAGTCT
GTTGAATTTACTAAAGACACTCAAACAGGCATGAGTGGTCAAACAACTCCTCAGGTTGAGACAGAAGACACGAAAGAGCCAGGAAATCGGGAAAAGCCTACAAAAAATATAACACCTATCCTTCCTGCAACAGGAGATATTGAGAATGTTTTGGCCTTTCTTGGAATCCTTATTTTGTCAGTACTTTCTATTTTTAGCCTTTTAAAAACAAACAAAACAATAAAGTCTGA SEQ ID NO: 107MVSSYMFARGEKMNNKMFLNKEAGFLVHTKRKRRFAVTLVGVFFLLLACAGAIGFGQVAYAADEKTVPNFKSPDPDYPWYGYDSYRGIFARYHNLKVNLKGSKEYQAYCFNLTKYFPRPTYSTTNNFYKKIDGSGSAFKSYAANPRVLDENLDKLEKNILNVIYNGYKSNANGFMNGIEDLNAILVTQNAIWYYSDSAPLNDVNKMWEREVRNGEISESQVTLMREALKKLTDPNLEATAANKIPSGYRLNIFKSENEDYQNLLSAEYVPDDPPKPGDTSEHNPKTPELDGTPIPEDPKRPDESSEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEKPSVDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEETKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSFFSETVTIVEDTRPKLVFHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIFSLLKNKQNNKV

19224134 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 181 LPATG (shown in italics in SEQ ID NO: 107, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 19224134 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in 19224134. The pilin motif sequenceis underlined in SEQ ID NO: 107, below. Conserved lysine (K) residuesare also marked in bold, at amino acid residues 275, 285, and 299. Thepilin sequence, in particular the conserved lysine residues, are thoughtto be important for the formation of oligomeric, pilus-like structures.Preferred fragments of 19224134 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO:107 MVSSYMFARGEKMNNKMFLNKEAGFLVHTKRKRRFAVTLVGVFFLLLACAGAIGFGQVAYAADEKTVPNFKSPDPDYPWYGYDSYRGIFARYHNLKVNLKGSKEYQAYCFNLTKYFPRPTYSTTNNFYKKIDGSGSAFKSYAANPRVLDENLDKLEKNTLNVIYNGYKSNANGFMNGIEDLNAILVTQNAIWYYSDSAPLNDVNKMWEREVRNGEISESQVTLMREALKKLIDPNLEATAANKIPSGYRLNIFKSENEDYQNLLSAEYVPDDPPKPGDTSEHNPKTPELDGTPIPEDPK RPDESSEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEKPSVDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSGFSETVTIVEDTRPKLVEHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIPSLLKNKQNNKV

Two E boxes containing conserved glutamic residues have been identifiedin 19224134. The E-box motifs are underlined in SEQ ID NO: 107, below.The conserved glutamic acid (E) residues, at amino acid residues 487 and524, are marked in bold. The E box motifs, in particular the conservedglutamic acid residues, are thought to be important for the formation ofoligomeric pilus-like structures of 19224134. Preferred fragments of19224134 include at least one conserved glutamic acid residue.Preferably, fragments include at least one E box motif. SEQ ID NO: 107MVSSYMFARGEKMNNKMFLNKEAGFLVHTKRKRRFAVTLVGVFFLLLACAGAIGFGQVAYAADEKTVPNFKSPDPDYPWYGYDSYRGIFARYHNLKVNLKGSKEYQAYCFNLTKYFPRPTYSTTNNFYKKIDGSGSAFKSYAANPRVLDENLDKLEKNTLNVIYNGYKSNANGFMNGIEDLNAILVTQNAIWYYSDSAPLNDVNKMWEREVRNGEISESQVTLMREALKKLIDPNLEATAANKIPSGYRLNIFKSENEDYQNLLSAEYVPDDPPKPGDTSEHNPKTPELDGTPIPEDPKRPDESSEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEKPSVDLPIEVPRYEFNNKDQSPLAGESGETEYITEVYGNQQNPVDIDKKLPNETGFSGNMVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPEVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSGQTTPQVETEDTKEPGVLMGGQSESVEFTKDTQTGMSGFSETVTIVEDTRPKLVEHFDNNEPKVEENREKPTKNITPILPATGDIENVLAFLGILILSVLSIPSLLKNKQNNKV

19224135 is thought to be a capsular polysaccharide adhesin (Cpa)protein. An example of a nucleotide sequence encoding the Cpa protein(SEQ ID NO: 108) and a Cpa protein amino acid sequence (SEQ ID NO: 109)are set forth below. SEQ ID NO: 108ATGAATAACAAAAAATTGCAAAAGAAGCAAGATGCTCCTCGGGTATCAAACAGAAAGCCAAAACAATTAACTGTCACTTTAGTGGGAGTATTTTTAATGTTTTTGACCTTGGTAAGTTCCATGAGAGGTGCTCAAAGCATATTTGGAGAGGAAAAGAGAATTGAAGAAGTCAGTGTTCCTAAAATAAAAAGTCCAGATGATGCCTACCCTTGGTATGGCTATGATTCATATGACTCTAGTCATCCTTACTATGAACGTTTTAAAGTAGCACATGATTTAAGGGTTAATTTAAATGGAAGTAAGAGCTACCAAGTATATTGCTTTAATATCAATTCTCATTATCCGAATAGAAAAAATGCTTTTTCTAAACAATGGTTTAAGAGAGTTGATGGGACAGGTGATGTGTTCACAAATTATGCTCAGACACCTAAGATTCGTGGAGAATCATTGAATAATAAACTTTTAAGTATTATGTACAACGCTTATCCTAAAAATGCTAATGGCTATATGGATAAGATAGAACCATTAAATGCTATTTTAGTAACTCAACAAGCTGTTTGGTACTATTCTGACAGTTCTTATGGTAATATAAAAACGTTATGGGCATCTGAGCTTAAAGACGGAAAAATAGATTTTGAACAAGTAAAATTAATGCGTGAAGCTTACTCAAAACTAATTAGTGATGATTTAGAAGAAACATCTAAAAATAAGCTACCTCAAGGATCTAAACTGAATATTTTTGTTCCGCAAGATAAATCTGTTCAAAATTTATTAAGTGCAGAGTACGTGCCTGAATCCCCTCCGGCACCAGGTCAGTCTCCAGAACCGCCAGTGCAAACAAAAAAAACATCAGTCATTATCAGAAAATATGCGGAAGGTGACTACTCTAAACTTCTAGAGGGAGCAACTTTGCGTTTAACAGGGGAAGATATCCTAGATTTTCAAGAAAAAGTCTTCCAAAGTAATGGAACAGGAGAAAAGATTGAATTATCAAATGGGACTTATACCTTAACAGAAACATCATCTCCAGATGGATATAAAATTGCGGAGCCGATTAAGTTTAGAGTAGTGAATAAAAAAGTATTTATCGTCCAAAAAGATGGTTCTCAAGTGGAAAATCCAAACAAAGAAGTAGCAGAGCCATACTCAGTGGAAGCGTACAGCGATATGCAAGATAGTAACTATATTAATCCAGAAACGTTCACTCCTTATGGGAAATTTTATTACGCTAAAAATAAGGATAAAAGTTCACAAGTTGTCTACTGTTTTAATGCTGATTTACACTCTCCACCTGAATCAGAGGATGGGGGAGGAACTATAGATCCTGATATTAGTACGATGAAAGAAGTCAAGTACACACATACGGCAGGTAGTGATTTGTTTAAATACGCGCTAAGACCGAGAGATACAAATCCAGAAGACTTCTTAAAGCACATTAAAAAAGTAATTGAAAAAGGCTACAATAAAAAAGGTGATAGCTATAATGGATTAACAGAAACACAGTTTCGCGCGGCTACTCAGCTTGCTATCTATTACTTTACAGACAGCACTGACTTAAAAACCTTAAAAACTTATAACAATGGGAAAGGTTACCATGGATTTGAATCTATGGATGAAAAAACCCTAGCTGTAACAAAAGAATTAATTAATTACGCTCAAGATAATAGTGCCCCTCAACTAACAAATCTTGATTTCTTCGTACCTAATAATAGCAAATACCAATCTCTTATTGGGACAGAATACCATCCAGATGATTTGGT
TGACGTGATTCGTATGGAAGATAAAAAGCAAGAAGTTATTCCAGTAACTCACAGTTTGACAGTGAAAAAAACAGTAGTCGGTGAGTTGGGAGATAAAACTAAAGGCTTCCAATTTGAACTTGAGTTGAAAGATAAAACTGGACAGCCTATTGTTAACACTCTAAAAACTAATAATCAAGATTTAGTAGCTAAAGATGGGAAATATTCATTTAATCTAAAGCATGGTGACACCATAAGAATAGAAGGATTACCGACGGGATATTCTTATACTCTGAAAGAGACTGAAGCTAAGGATTATATAGTAACCGTTGATAACAAAGTTAGTCAAGAAGCTCAATCAGCAAGTGAGAATGTCACAGCAGACAAAGAAGTCACTTTTGAAAACCGTAAAGATCTTGTCCCACCAACTGGTTTTATTACTGATGGTGGAACCTATGTGTGGTTATTATTGCTTGTCGCATTTGGTTTGTTAGTGTGGTTCTTTGGTGGT AAAGGACTAAAAAATGACTAASEQ ID NO: 109 MNNKKLQKKQDAPRVSNRKPKQLTVTLVGVFLMELTLVSSMRGAQSIFGEEKRIEEVSVPKIKSPDDAYPWYGYDSYDSSHPYYERFKVAHDLRVNLNGSKSYQVYCFNINSHYPNRKNAFSKQWFKRVDGTGDVFTNYAQTPKIRGESLNNKLLSIMYNAYPKNANGYMDKIEPLNAILVTQQAVWYYSDSSYGNIKTLWASELKDGKIDFEQVKLMREAYSKLISDDLEETSKNKLPQGSKLNIFVPQDKSVQNLLSAEYVPESPPAPGQSPEPPVQTKKTSVIIRKYAEGDYSKLLEGATLRLTGEDILDFQEKVFQSNGTGEKIELSNGTYTLTETSSPDGYKIAEPIKFRVVNKKVFIVQKDGSQVENPNKEVAEPYSVEAYSDMQDSNYINPETFTPYGKFYYAKNKDKSSQVVYCFNADLHSPPESEDGGGTIDPDISTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVTEKGYNKKGDSYNGLTETQFRAATQLAIYYFTDSTDLKTLKTYNNGKGYHGFESMDEKTLAVTKELINYAQDNSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSASENVTADKEVTFENRKDLVPPTGFITDGGTYLWLLLLVPFGLLVWFFGR KGLKND

19224135 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 184 VPPTG (shown in italics in SEQ ID NO: 109, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 19224135 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in 19224135. The pilin motif sequenceis underlined in SEQ ID NO: 109, below. Conserved lysine (K) residuesare also marked in bold, at amino acid residues 164 and 172. The pilinsequence, in particular the conserved lysine residues, are thought to beimportant for the formation of oligomeric, pilus-like structures.Preferred fragments of 19224135 include at least one conserved lysineresidue. Preferably, fragments include the pilin sequence. SEQ ID NO:109 MNNKKLQKKQDAPRVSNRKPKQLTVTLVGVFLMELTLVSSMRGAQSIFGEEKRIEEVSVPKIKSPDDAYPWYGYDSYDSSHPYYERFKVAHDLRVNLNGSKSYQVYCFNINSHYPNRKNAFSKQWFKRVDGTGDVFTNYAQTPKIRGESLNNKLLSIMYNAYPKNANGYMDK IEPLNAILVTQQAVWYYSDSSYGNIKTLWASELKDGKIDFEQVKLMREAYSKLISDDLEETSKNKLPQGSKLNIFVPQDKSVQNLLSAEYVPESPPAPGQSPEPPVQTKKTSVIIRKYAEGDYSKLLEGATLRLTGEDILDFQEKVFQSNGTGEKIELSNGTYTLTETSSPDGYKIAEPIKFRVVNKKVFIVQKDGSQVENPNKEVAEPYSVEAYSDMQDSNYINPETFTPYGKFYYAKNKDKSSQVVYCFNADLHSPPESEDGGGTIDPDISTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVTEKGYNKKGDSYNGLTETQFRAATQLAIYYFTDSTDLKTLKTYNNGKGYHGFESMDEKTLAVTKELINYAQDNSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSASENVTADKEVTFENRKDLVPPTGFITDGGTYLWLLLLVPFGLLVWFFGR KGLKND

An E box containing a conserved glutamic residue has been identified in19224135. The E-box motif is underlined in SEQ ID NO: 109, below. Theconserved glutamic acid (E), at amino acid residue 339, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of 19224135. Preferred fragments of 19224135include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 109MNNKKLQKKQDAPRVSNRKPKQLTVTLVGVFLMELTLVSSMRGAQSIFGEEKRIEEVSVPKIKSPDDAYPWYGYDSYDSSHPYYERFKVAHDLRVNLNGSKSYQVYCFNINSHYPNRKNAFSKQWFKRVDGTGDVFTNYAQTPKIRGESLNNKLLSIMYNAYPKNANGYMDKIEPLNAILVTQQAVWYYSDSSYGNIKTLWASELKDGKIDFEQVKLMREAYSKLISDDLEETSKNKLPQGSKLNIFVPQDKSVQNLLSAEYVPESPPAPGQSPEPPVQTKKTSVIIRKYAEGDYSKLLEGATLRLTGEDILDFQEKVFQSNGTGEKIELSNGTYTLTETSSPDGYKIAEPIKFRVVNKKVFIVQKDGSQVENPNKEVAEPYSVEAYSDMQDSNYINPETFTPYGKFYYAKNKDKSSQVVYCFNADLHSPPESEDGGGTIDPDISTMKEVKYTHTAGSDLFKYALRPRDTNPEDFLKHIKKVTEKGYNKKGDSYNGLTETQFRAATQLAIYYFTDSTDLKTLKTYNNGKGYHGFESMDEKTLAVTKELINYAQDNSAPQLTNLDFFVPNNSKYQSLIGTEYHPDDLVDVIRMEDKKQEVIPVTHSLTVKKTVVGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEGLPTGYSYTLKETEAKDYIVTVDNKVSQEAQSASENVTADKEVTFENRKDLVPPTGFITDGGTYLWLLLLVPFGLLVWFFGR KGLKND

19224136 is thought to be a LepA protein. An example of a nucleotidesequence encoding the LepA protein (SEQ ID NO: 110) and a LepA proteinamino acid sequence (SEQ ID NO: 111) are set forth below. SEQ ID NO: 110ATGACTAATTACCTAAATCGCTTAAATGAGAATCCACTATTTAAAGCTTTCATACGGTTAGTACTTAAGATTTCTATTATTGGATTTCTAGGTTACATTCTATTTCAGTATGTTTTTGGCGTCATGATTGTTAACACAAATCAGATGAGTCCTGCTGTAAGTGCTGGTGATGGAGTCTTATATTATCGTTTGACTGATCGCTATCATATTAATGATGTGGTGGTCTATGAGGTTGATAACACTTTGAAAGTTGGTCGAATTGCCGCTCAAGCTGGCGATGAGGTTAGTTTTACGCAAGAAGGAGGACTGTTGATTAATGGGCATCCACCAGAAAAAGAGGTCCCTTACCTGACGTATCCTCACTCAAGTGGTCCAAACTTTCCCTATAAAGTTCCTACGGGTACGTATTTCATATTGAATGATTATCGTGAAGAACGTTTGGACAGTCGTTATTATGGGGCGTTACCCATCAATCAAATCAAAGGGAAAATCTCAACTCTATTAAGAGTGAGAGGAATTTAA SEQ ID NO: 111MTNYLNRLNENPLFKAFIRLVLKISIIGFLGYILFQYVFGVMIVNTNQMSPAVSAGDGVLYYRLTDRYHINDVVVYEVDNTLKVGRIAAQAGDEVSFTQEGGLLINGHPPEKEVPYLTYPHSSGPNFPYKVPTGTYFILNDYREERLDSRYYGALPINQIKGKISTLLRVRGI
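The relationship between a nucleotide sequence and its encoded amino acid sequence can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code and an in-frame coding sequence; the function name `translate` and the handling of ambiguous bases as `X` are illustrative choices, not part of the specification. The example input is the first 30 nucleotides of SEQ ID NO: 110.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string up to (not including) the first stop codon.

    Whitespace is ignored. Codons containing an ambiguous base (for example 'N')
    are reported as 'X' (an illustrative convention, assumed here).
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons of SEQ ID NO: 110.
five_prime = "ATGACTAATTACCTAAATCGCTTAAATGAG"
print(translate(five_prime))
```

The translated fragment can be compared against the first ten residues of SEQ ID NO: 111 as a consistency check between the two listings.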

19224137 is thought to be a fimbrial protein. An example of a nucleotidesequence encoding the fimbrial protein (SEQ ID NO: 112) and a fimbrialprotein amino acid sequence (SEQ ID NO: 113) are set forth below. SEQ IDNO: 112 ATGAAAAAAAATAAATTATTACTTGCTACTGCAATCTTAGCAACTGCTTTAGGAACAGCTTCTTTAAATCAAAACGTAAAAGCTGAGACGGCAGGGGTTGTTAGCAGTGGTCAATTAACAATAAAAAAATCAATTACAAATTTTAATGATGATACACTTTTGATGCCTAAGACAGACTATACTTTTAGCGTTAATCCGGATAGTGCGGCTACAGGTACTGAAAGTAATTTACCAATTAAACCAGGTATTGCTGTTAACAATCAAGATATTAAGGTTTCTTATTCTAATACTGATAAGACATCAGGTAAAGAAAAACAAGTTGTTGTTGACTTTATGAAAGTTACTTTTCCTAGCGTTGGTATTTACCGTTATGTTGTTACCGAGAATAAAGGGACAGCAGAAGGAGTTACATATGATGATACAAAATGGTTAGTTGACGTCTATGTTGGTAATAATGAAAAGGGAGGTCTTGAACCAAAGTATATTGTATCTAAAAAAGGAGATTCTGCTACTAAAGAACCAATCCAGTTTAATAATTCATTCGAAACAACGTCATTAAAAATTGAAAAGGAAGTTACTGGTAATACAGGAGATCATAAAAAAGCATTTACCTTTACATTAACATTGCAACCAAATGAATACTATGAGGCAAGTTCGGTTGTGAAAATTGAAGAGAACGGACAAACGAAAGATGTGAAAATTGGGGAGGCATATAAGTTTACTTGAACGAATAGTCAGAGTGTGATATTGTCTAAATTACCAGTTGGTATTAATTATAAAGTTGAAGAAGCAGAAGCTAATCAAGGTGGATATACTACAACAGCAACTTTAAAAGATGGAGAAAAGTTATCTACTTATAACTTAGGTCAGGAACATAAAACAGACAAGACTGCTGATGAAATCGTTGTCACAAATAACCGTGACACTCAAGTTCCAACTGGTGTTGTAGGCACCCTTGCTCCATTTGCAGTTCTTAGCATTGTGGCTATTGGTGGAGTTATCTATATTACAAAACGTAAAAAAGCTTAA SEQ ID NO: 113MKKNKLLLATAILATALGTASLNQNVKAETAGVVSSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDFMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKEVTGNTGDHKKAFTFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLSTYNLGQEHKTDKTADEIVVTNNRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

19224137 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 140 QVPTG (shown in italics in SEQ ID NO: 113, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 19224137 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

A pilin motif, discussed above, containing a conserved lysine (K)residue has also been identified in 19224137. The pilin motif sequenceis underlined in SEQ ID NO: 113, below. A conserved lysine (K) residueis also marked in bold, at amino acid residue 160. The pilin sequence,in particular the conserved lysine residues, are thought to be importantfor the formation of oligomeric, pilus-like structures. Preferredfragments of 19224137 include the conserved lysine residue. Preferably,fragments include the pilin sequence. SEQ ID NO: 113MKKNKLLLATAILATALGTASLNQNVKAETAGVVSSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDFMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKEVTGNTGDHKKAFTFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLSTYNLGQEHKTDKTADEIVVTNNRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

An E box containing a conserved glutamic residue has been identified in19224137. The E-box motif is underlined in SEQ ID NO: 113, below. Theconserved glutamic acid (E), at amino acid residue 263, is marked inbold. The E box motif, in particular the conserved glutamic acidresidue, is thought to be important for the formation of oligomericpilus-like structures of 19224137. Preferred fragments of 19224137include the conserved glutamic acid residue. Preferably, fragmentsinclude the E box motif. SEQ ID NO: 113MKKNKLLLATAILATALGTASLNQNVKAETAGVVSSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDFMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKEVTGNTGDHKKAFTFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLSTYNLGQEHKTDKTADEIVVTNNRDTQVPTGVVGTLAPFAVLSIVAIGGVIYITKRKKA

19224138 is thought to be a SrtC2-type sortase. An example of anucleotide sequence encoding the SrtC2 sortase (SEQ ID NO: 114) and aSrtC2 sortase amino acid sequence (SEQ ID NO: 115) are set forth below.SEQ ID NO: 114 ATGATGATGACAATTGTACAGGTTATCAATAAAGCCATTGATACTCTCATTCTTATCTTTTGTTTAGTCGTACTATTTTTAGCTGGTTTTGGTTTGTGGGATTCTTATCATCTCTATCAACAAGCAGACGCTTCTAATTTCAAAAAATTTAAAACAGCTCAACAACAGCCTAAATTTGAAGACTTGTTAGCTTTGAATGAGGATGTCATTGGTTGGTTAAATATCCCGGGGACTCATATTGATTATCCTCTAGTTCAGGGAAAAACGAATTTAGAGTATATTAATAAAGCAGTTGATGGCAGTGTTGCCATGTCTGGTAGTTTATTTTTAGATACACGGAATCATAATGATTTTACGGACGATTACTCTCTGATTTATGGCCATCATATGGCAGGTAATGCCATGTTTGGCGAAATTCCAAAATTTTTAAAAAAGGATTTTTTCAACAAACATAATAAAGCTATCATTGAAACAAAAGAGAGAAAAAAACTAACCGTCACTATTTTTGCTTGTCTCAAGACAGATGCCTTTGACCAGTTAGTTTTTAATCCTAATGCTATTACCAATCAAGACCAACAAAGGCAGCTCGTTGATTATATCAGTAAAAGATCAAAACAATTTAAACCTGTTAAATTGAAGCATCATACAAAGTTCGTTGCTTTTTCAACGTGTGAAAATTTTTCTACTGACAATCGTGTTATCGTTGTCGGTACTATTCAAGAATAA SEQ ID NO: 115MMMTIVQVINKAIDTLILIFCLVVLFLAGFGLWDSYHLYQQADASNFKKFKTAQQQPKFEDLLALNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKAVDGSVAMSGSLFLDTRNHNDFTDDYSLIYGHHMAGNAMFGEIPKFLKKDFFNKHNKAIIETKERKKLTVTIFACLKTDAFDQLVFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVIVVGTIQE

19224139 is an open reading frame that encodes a sortase substrate motifLPXAG shown in italics in SEQ ID NO: 117. An example of a nucleotidesequence of the open reading frame (SEQ ID NO: 116) and the amino acidsequence encoded by the open reading frame (SEQ ID NO: 117) are setforth below. SEQ ID NO: 116ATGTTATTTTCTGTCGTAATGATATTAACCATGCTGGCCTTTAATCAGACTGTTTTAGCAAAAGACAGCACTGTTCAAACTAGCATTAGTGTCGAAAATGTCTTAGAGAGAGCAGGCGATAGTACCCCATTTTCGATTGCATTAGAATCAATTGATGCGATGAAAACAATAGAAGAAATAACAATTGCTGGTTCTGGAAAAGCAAGCTTTTCCCCTCTGACCTTCACAACAGTTGGGCAATATACTTATCGTGTTTATCAGAAGCCTTCACAAAATAAAGATTATCAAGCAGATACTACTGTATTTGACGTTCTTGTCTATGTGACCTATGATGAAGATGGGACTCTAGTCGCAAAAGTTATTTCTCGAAGGGCTGGAGACGAAGAAAAATCAGCGATTACTTTTAAGCCCAAACGGTTAGTAAAACCAATACCGCCTAGACAACCTAACATCCCTAAAACCCCATTACCATTAGCTGGTGAAGTAAAAAGTTTATTGGGTATCTTAAGTATCGTATTACTGGGGTTACTAGTTCTTCTTTATGTTAAAA AACTGAAGAG SEQ ID NO:117 MLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSKL

19224139 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 185 LPLAG (shown in italics in SEQ ID NO: 117, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 19224139 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.
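Locating the cell wall anchor motif and producing a construct without it can be sketched in a few lines. This is a hedged illustration only: the pattern `LP.AG` stands for the LPXAG sortase substrate motif, the helper names `find_anchor` and `remove_anchor` are hypothetical, and truncating the protein immediately upstream of the motif is just one assumed way of "removing" it (the specification does not prescribe a particular construct). The sequence is SEQ ID NO: 117, written in blocks of ten residues.

```python
import re

# SEQ ID NO: 117 in blocks of ten residues (189 residues in total).
SEQ_ID_NO_117 = "".join([
    "MLFSVVMILT", "MLAFNQTVLA", "KDSTVQTSIS", "VENVLERAGD", "STPFSIALES",
    "IDAMKTIEEI", "TIAGSGKASF", "SPLTFTTVGQ", "YTYRVYQKPS", "QNKDYQADTT",
    "VFDVLVYVTY", "DEDGTLVAKV", "ISRRAGDEEK", "SAITFKPKRL", "VKPIPPRQPN",
    "IPKTPLPLAG", "EVKSLLGILS", "IVLLGLLVLL", "YVKKLKSKL",
])

# LPXAG, where X is any residue (assumed pattern for this example).
ANCHOR_PATTERN = re.compile(r"LP.AG")


def _last_match(protein):
    """Return the most C-terminal match of the anchor pattern, or None."""
    matches = list(ANCHOR_PATTERN.finditer(protein))
    return matches[-1] if matches else None


def find_anchor(protein):
    """Return (1-based start position, matched motif), or None if absent."""
    match = _last_match(protein)
    if match is None:
        return None
    return match.start() + 1, match.group()


def remove_anchor(protein):
    """Truncate the protein just before the anchor motif (assumed strategy)."""
    match = _last_match(protein)
    return protein if match is None else protein[:match.start()]


print(find_anchor(SEQ_ID_NO_117))
print(len(remove_anchor(SEQ_ID_NO_117)))
```

Truncation also discards the hydrophobic stretch and charged tail that follow the motif, which is a design choice of this sketch rather than a requirement stated in the text.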

A pilin motif, discussed above, containing a conserved lysine (K) residue has also been identified in 19224139. The pilin motif sequence is underlined in SEQ ID NO: 117, below. A conserved lysine (K) residue is also marked in bold, at amino acid residue 138. The pilin sequence, in particular the conserved lysine residue, is thought to be important for the formation of oligomeric, pilus-like structures. Preferred fragments of 19224139 include the conserved lysine residue. Preferably, fragments include the pilin sequence. SEQ ID NO: 117 MLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSKL

Two E boxes containing conserved glutamic acid residues have been identified in 19224139. The E-box motifs are underlined in SEQ ID NO: 117, below. The conserved glutamic acid (E) residues, at amino acid residues 58 and 128, are marked in bold. The E box motifs, in particular the conserved glutamic acid residues, are thought to be important for the formation of oligomeric pilus-like structures of 19224139. Preferred fragments of 19224139 include at least one conserved glutamic acid residue. Preferably, fragments include at least one E box motif. SEQ ID NO: 117 MLFSVVMILTMLAFNQTVLAKDSTVQTSISVENVLERAGDSTPFSIALESIDAMKTIEEITIAGSGKASFSPLTFTTVGQYTYRVYQKPSQNKDYQADTTVFDVLVYVTYDEDGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPNIPKTPLPLAGEVKSLLGILSIVLLGLLVLLYVKKLKSKL
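Because the bold and underline markings do not survive in plain text, the stated residue positions can be confirmed directly against the sequence, and a candidate fragment can be tested for whether it spans a conserved residue. The sketch below assumes 1-based residue numbering, as used in the text; the helper names `residue_at` and `fragment_covers` are hypothetical and introduced only for illustration.

```python
# SEQ ID NO: 117 in blocks of ten residues (189 residues in total).
SEQ_ID_NO_117 = "".join([
    "MLFSVVMILT", "MLAFNQTVLA", "KDSTVQTSIS", "VENVLERAGD", "STPFSIALES",
    "IDAMKTIEEI", "TIAGSGKASF", "SPLTFTTVGQ", "YTYRVYQKPS", "QNKDYQADTT",
    "VFDVLVYVTY", "DEDGTLVAKV", "ISRRAGDEEK", "SAITFKPKRL", "VKPIPPRQPN",
    "IPKTPLPLAG", "EVKSLLGILS", "IVLLGLLVLL", "YVKKLKSKL",
])

# Positions stated in the text for 19224139 (1-based).
CONSERVED_GLUTAMIC_ACIDS = (58, 128)
CONSERVED_LYSINE = 138


def residue_at(protein, position):
    """Return the residue at a 1-based position."""
    if not 1 <= position <= len(protein):
        raise IndexError(f"position {position} is outside 1..{len(protein)}")
    return protein[position - 1]


def fragment_covers(start, end, positions):
    """True if the 1-based inclusive range start..end contains any listed position."""
    return any(start <= p <= end for p in positions)


for pos in CONSERVED_GLUTAMIC_ACIDS + (CONSERVED_LYSINE,):
    print(pos, residue_at(SEQ_ID_NO_117, pos))

# A fragment spanning residues 50-70 includes the conserved E at position 58.
print(fragment_covers(50, 70, CONSERVED_GLUTAMIC_ACIDS))
```

A check of this kind only confirms that a fragment's coordinates include a stated position; it says nothing about whether the fragment retains the surrounding motif or any biological activity.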

19224140 is thought to be a MsmRL protein. An example of a nucleotidesequence encoding the MsmRL protein (SEQ ID NO: 118) and a MsmRL proteinamino acid sequence (SEQ ID NO: 119) are set forth below. SEQ ID NO: 118ATGGTTATATTCGATTTAAAACATGTGCAAACATTACACAGCTTGTCTCAATTACCTATTTCAGTGATGTCACAAGATAAGGCACTTATTCAAGTATATGGTAATGACGACTATTTATTATGTTACTATCAATTTTTAAAGCATCTAGCTATTCCTCAAGCTGCACAAGATGTTATTTTTTATGAGGGTTTATTTGAAGAGTCCTTTATGATTTTTCCTCTTTGTCACTACATTATTGCCATTGGACCTTTCTACCCTTATTCACTTAATAAAGACTATCAGGAACAATTAGCTAATAATTTTTTAAAACATTCTTCTCATCGTAGCAAAGAAGAGCTCTTATCCTATATGGCACTTGTCCCACATTTTCCAATTAATAATGTGCGGAACCTTTTGATAGCTATTGACGCTTTTTTTGACACACAATTTGAGACGACTTGCCAACAAACAATTCATCAATTGTTGCAGCATTCAAAACAGATGACTGCTGATCCTGATATCATTCATCGCCTTAAGCATATTAGCAAAGCATCTAGCCAACTACCGCCTGTTTTAGAGCACCTAAATCATATTATGGATCTGGTAAAGCTAGGCAATCCACAATTGCTCAAGCAAGAAATCAATCGCATCCCCTTATCAAGTATCACCTCATCTTCTATTTCTGCTCTAAGGGCGGAAAAGAACCTCACTGTTATCTATTTAACTAGGTTACTGGAATTCAGTTTTGTAGAAAATACTGACGTAGCAAAGCATTATAGCCTTGTCAAATACTACATGGCCTTAAATGAAGAAGCGAGTGACTTGCTCAAAGTTTTGAGAATTCGCTGTGCAGCCATCATCCATTTTTCCGAATCATTAACCAATAAAAGTATTTCTGATAAACGTCAAATGTACAATAGTGTGCTTCATTATGTCGATAGTCACCTGTATTCCAAATTAAAGGTATCTGATATCGCTAAGCGCCTATATGTTTCCGAATCTCACTTACGTTCAGTCTTTAAAAAATACTCAAATGTTTCCTTACAACATTATATTCTAAGTACAAAAATCAAAGAAGCTCAACTACTCTTAAAACGAGGAATTCCTGTTGGAGAAGTGGCTAAAAGCTTATATTTTTATGACACTACCCATTTTCATAAAATCTTTAAAAAATACACGGGTATTTCTTCAAAAGACTATCTTGCTAAATACCGAGATAAT ATTTAA SEQ ID NO: 119MVIFDLKHVQTLHSLSQLPISVMSQDKALIQVYGNDDYLLCYYQFLKHLATPQAAQDVIFYEGLFEESFMIFPLCHYIIAIGPFYPYSLNKDYQEQLANNFLKHSSHRSKEELLSYMALVPHFPINNVRNLLIAIDAFFDTQFETTCQQTIHQLLQHSKQMTADPDIIHRLKHISKASSQLPPVLEHLNHIMDLVKLGNPQLLKQEINRIPLSSITSSSISALRAEKNLTVIYLTRLLEFSFVENTDVAKHYSLVKYYMALNEEASDLLKVLRIRCAAIIHFSESLTNKSISDKRQMYNSVLHYVDSHLYSKLKVSDIAKRLYVSESHLRSVFKKYSNVSLQHYILSTKIKEAQLLLKRGIPVGEVAKSLYFYDTTHFHKIFKKYTGISSKDYLAKYRDN I

19224141 is thought to be a protein F2 fibronectin binding protein. Anexample of a nucleotide sequence encoding the protein F2 fibronectinbinding protein (SEQ ID NO: 120) and a protein F2 fibronectin bindingprotein amino acid sequence (SEQ ID NO: 121) are set forth below. SEQ IDNO: 120 ATGACACAAAAAAATAGCTATAAGTTAAGCTTCCTGTTATCCCTAACAGGATTTATTTTAGGTTTATTATTGGTTTTTATAGGATTGTCCGGAGTATCAGTAGGACATGCGGAAACAAGAAATGGAGCAAACAAACAAGGATCTTTTGAAATCAAGAAAGTCGAGCAAAAGAATAAGGCTTTACCGGGAGCAAGTTTTTCAGTGACATCAAAGGATGGCAAGGGAACATGTGTTCAAAGGTTGACTTGAAATGATAAAGGTATTGTAGATGGTCAAAATCTCGAACCAGGGACTTATAGCTTAAAAGAAGAAACAGCACCAGATGGTTATGATAAAACCAGCCGGAGTTGGACAGTGACTGTTTATGAGAACGGCTATAGCAAGTTGGTTGAAAATCCCTATAATGGGGAAATCATCAGTAAAGCAGGGTCAAAAGATGTTAGTAGTTCTTTACAGTTGGAAAATCCGAAAATGTCAGTTGTTTCTAAATATGGGAAAACAGAGGTTAGTAGTGGCGCAGCGGATTTCTAGCGGAAGGATGCCGCCTATTTTAAAATGTGTTTTGAGTTGAAACAAAAGGATAAATCTGAAACAATCAACCCAGGTGATACCTTTGTGTTACAGCTGGATAGACGTCTGAATCCTAAAGGTATCAGTCAAGATATCCCTAAAATCATTTACGACAGTGGAAATAGTCGGGTTGCGATTGGAAAATACCATGGTGAGAACCATCAACTTATCTATACTTTCACAGATTATATTGCGGGTTTAGATAAAGTCCAGTTGTCTGCAGAATTGAGCTTATTCCTAGAGAATAAGGAAGTGTTGGAAAATACTAGTATGTCAAATTTTAAGAGTAGCATAGGTGGGCAGGAGATCAGCTATAAAGGAACGGTTAATGTTCTTTATGGAAATGAGAGCACTAAAGAAAGCAATTATATTAGTAATGGATTGAGGAATGTGGGTGGGAGTATTGAAAGCTACAACACCGAAACGGGAGAATTTGTCTGGTATGTTTATGTCAATCCAAACCGTACCAATATTCCTTATGCGACGATGAATTTATGGGGATTTGGAAGGGCTCGTTCAAATACAAGCGACTTAGAAAACGACGCTAATACAAGTAGTGCTGAGCTTGGAGAGATTCAGGTCTATGAAGTACCTGAAGGAGAAAAATTACCATCAAGTTATGGGGTTGATGTTAGAAAAGTTACTTTAAGAACGGATATCACAGCAGGCCTAGGAAATGGTTTTCAAATGAGCAAACGTCAGCGAATTGACTTTGGAAATAATATCCAAAATAAAGCATTTATCATCAAAGTAACAGGGAAAACAGACCAATCTGGTAAGCCATTGGTTGTTCAATCCAATTTGGCAAGTTTTCGTGGTGCTTCTGAATATGCTGCTTTTACTCCAGTTGGAGGAAATGTCTACTTCCAAAACGAAATTGCCTTGTCTCCTTCTAAGGGTAGTGGTTCTGGGAAAAGTGAATTTACTAAGCCCTCTATTACAGTAGCAAATCTAAAACGAGTGGCTCAGCTTCGCTTTAAGAAAATGTCAACTGAGAATGTGCCATTGCCAGAAGCGGCTTTTGAGCTGCGTTCATCAAATGGTAATAGTCAGAAATTAGAAGCCAGTTCAAACACACAAGGAGAGGTTCACTTTAAGGACCTGACCTCGGGCACATATGACCTGTATGAAACAAAAGCGCCA
AAAGGTTATCAGCAGGTGACAGAGAAATTGGCGACCGTTACTGTTGATACTACCAAACCTGCTGAGGAAATGGTCACTTGGGGAAGCCCACATTCGTCTGTAAAAGTAGAAGCTAACAAAGAAGTCACGATTGTCAACCATAAAGAAACCCTTACGTTTTCAGGGAAGAAAATTTGGGAGAATGACAGACCAGATCAACGCCCAGCAAAGATTCAAGTGCAACTGTTGCAAAATGGTCAAAAGATGCCTAACCAGATTCAAGAAGTAACGAAGGATAACGATTGGTCTTATCACTTCAAAGACTTGCCTAAGTACGATGCCAAGAATCAGGAGTATAAGTACTCAGTTGAAGAAGTAAATGTTCCAGACGGCTACAAGGTGTCGTATTTAGGAAATGATATATTTAACACCAGAGAAACAGAATTTGTGTTTGAACAGAATAACTTTAACCTTGAATTTGGAAATGCTGAAATAAAAGGTCAATCTGGGTCAAAAATCATTGATGAAGACACGCTAACGTCTTTCAAAGGTAAGAAAATTTGGAAAAATGATACGGCAGAAAATCGTCCCCAAGCCATTCAAGTGCAGCTTTATGCTGATGGAGTGGCTGTGGAAGGTCAAACCAAATTTATTTCTGGCTCAGGTAATGAGTGGTCATTTGAGTTTAAAAACTTGAAGAAGTATAATGGAACAGGTAATGACATCATTTACTCAGTTAAAGAAGTAACTGTTCCAACAGGTTATGATGTGACTTACTCAGCTAATGATATTATTAATACCAAACGTGAGGTTATTACACAACAAGGACCGAAACTAGAGATTGAAGAAACGCTTCCGCTAGAATCAGGTGCTTCAGGCGGTACCACTACTGTCGAAGACTCACGCCCAGTTGATACCTTATCAGGTTTATCAAGTGAGCAAGGTCAGTCCGGTGATATGACAATTGAAGAAGATAGTGCTACCCATATTAAATTCTCAAAACGTGATATTGACGGCAAAGAGTTAGCTGGTGCAACTATGGAGTTGCGTGATTCATCTGGTAAAACTATTAGTACATGGATTTCAGATGGACAAGTGAAAGATTTCTACCTGATGCCAGGAAAATATACATTTGTCGAAACCGCAGCACCAGACGGTTATGAGATAGCAACTGCTATTACCTTTACAGTTAATGAGCAAGGTCAGGTTACTGTAAATGGCAAAGCAACTAAAGGTGACACTCATATTGTCATGGTTGATGCTTACAAGCCAACTAAGGGTTCAGGTCAGGTTATTGATATTGAAGAAAAGCTTCCAGACGAGCAAGGTCATTCTGGTTCAACTACTGAAATAGAAGACAGTAAATCTTCAGACCTTATCATTGGCGGTCAAGGTGAAGTTGTTGACACAACAGAAGACACACAAAGTGGTATGACGGGCCATTCTGGCTCAACTACTGAAATAGAAGATAGCAAGTCTTCAGACCTTATCATTGGTGGTCAGGGGCAGGTTGTCGAGACAACAGAGGATACCCAAACTGGCATGTACGGGGATTCTGGTTGTAAAACGGAAGTCGAAGATACTAAACTAGTACAATCCTTCCACTTTGATAACAAGGAACCAGAAAGTAACTCTGAGATTCCTAAAAAAGATAAGCCAAAGAGTAATAGTAGTTTACCAGCAACTGGTGAGAAGCAACATAATATGTTCTTTTGGATGGTTACTTCTTGCTCACTTATTAGTAGTGTTTTTGTAATATCACTAAAATCCAAAAAACGCCTATCATCATGTTAA SEQ ID NO: 
121MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGSFEIKKVDQNNKPLPGATFSLTSKDGKGTSVQTFTSNDKGIVDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSSLQLENPKMSVVSKYGKTEVSSGAADFYRNHAAYFKMSFELKQKDKSETINPGDTFVLQLDRRLNPKGISQDIPKIIYDSANSPLAIGKYHAENHQLIYTFTDYIAGLDKVQLSAELSLFLENKEVLENTSISNFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNTPYATMNLWGFGRARSNTSDLENDANTSSAELGEIQVYEVPEGEKLPSSYGVDVTKLTLRTDITAGLGNGFQMTKRQRIDFGNNIQNKAFIIKVTGKTDQSGKPLVVQSNLASERGASEYAAFTPVGGNVYFQNEIALSPSKGSGSGKSEFTKPSITVANLKRVAQLRFKKMSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEVHFKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEEMVTWGSPHSSVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVNVPDGYKVSYLGNDIFNTRETEFVFEQNNFNLEFGNAETKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDTIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPKLEIEETLPLESGASGGTTTVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDTHTVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDLIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMYGDSGCKTEVEDTKLVQSFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVFVI SLKSKKRLSSC

19224141 contains an amino acid motif indicative of a cell wall anchor: SEQ ID NO: 181 LPATG (shown in italics in SEQ ID NO: 121, above). In some recombinant host cell systems, it may be preferable to remove this motif to facilitate secretion of a recombinant 19224141 protein from the host cell. Alternatively, in other recombinant host cell systems, it may be preferable to use the cell wall anchor motif to anchor the recombinantly expressed protein to the cell wall. The extracellular domain of the expressed protein may be cleaved during purification or the recombinant protein may be left attached to either inactivated host cells or cell membranes in the final composition.

Two pilin motifs, discussed above, containing conserved lysine (K)residues have also been identified in 19224141. The pilin motifsequences are underlined in SEQ ID NO: 121, below. Conserved lysine (K)residues are also marked in bold, at amino acid residues 157 and 163 andat amino acid residues 216, 224, and 238. The pilin sequence, inparticular the conserved lysine residues, are thought to be importantfor the formation of oligomeric, pilus-like structures. Preferredfragments of 19224141 include at least one conserved lysine residue.Preferably, fragments include at least one pilin sequence. SEQ ID NO:121 MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGSFEIKKVDQNNKPLPGATFSLTSKDGKGTSVQTFTSNDKGIVDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSS LQLENPKMSVVSKYGKTEVSSGAADFYRNHAAYFKMSFELKQKDKSETINPGDTFVLQLDRRLNPKGISQDIPKIIYDSANSPLAIGK YHAENHQLIYTFTDYIAGLDKVQLSAELSLFLENKEVLENTSISNFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNTPYATMNLWGFGRARSNTSDLENDANTSSAELGEIQVYEVPEGEKLPSSYGVDVTKLTLRTDITAGLGNGFQMTKRQRIDFGNNIQNKAFIIKVTGKTDQSGKPLVVQSNLASERGASEYAAFTPVGGNVYFQNEIALSPSKGSGSGKSEFTKPSITVANLKRVAQLRFKKMSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEVHFKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEEMVTWGSPHSSVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVNVPDGYKVSYLGNDIFNTRETEFVFEQNNFNLEFGNAETKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDTIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPKLEIEETLPLESGASGGTTTVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDTHTVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDLIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMYGDSGCKTEVEDTKLVQSFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVFVI SLKSKKRLSSC

Two E boxes containing conserved glutamic residues have been identifiedin 19224141. The E-box motifs are underlined in SEQ ID NO: 121, below.The conserved glutamic acid (E) residues, at amino acid residues 567 and944, are marked in bold. The E box motifs, in particular the conservedglutamic acid residues, are thought to be important for the formation ofoligomeric pilus-like structures of 19224141. Preferred fragments of19224141 include at least one conserved glutamic acid residue.Preferably, fragments include at least one E box motif. SEQ ID NO: 121MTQKNSYKLSFLLSLTGFILGLLLVFIGLSGVSVGHAETRNGANKQGSFEIKKVDQNNKPLPGATFSLTSKDGKGTSVQTFTSNDKGIVDAQNLQPGTYTLKEETAPDGYDKTSRTWTVTVYENGYTKLVENPYNGEIISKAGSKDVSSSLQLENPKMSVVSKYGKTEVSSGAADFYRNHAAYFKMSFELKQKDKSETINPGDTFVLQLDRRLNPKGISQDIPKIIYDSANSPLAIGKYHAENHQLIYTFTDYIAGLDKVQLSAELSLFLENKEVLENTSISNFKSTIGGQEITYKGTVNVLYGNESTKESNYITNGLSNVGGSIESYNTETGEFVWYVYVNPNRTNTPYATMNLWGFGRARSNTSDLENDANTSSAELGEIQVYEVPEGEKLPSSYGVDVTKLTLRTDITAGLGNGFQMTKRQRIDFGNNIQNKAFIIKVTGKTDQSGKPLVVQSNLASERGASEYAAFTPVGGNVYFQNEIALSPSKGSGSGKSEFTKPSITVANLKRVAQLRFKKMSTDNVPLPEAAFELRSSNGNSQKLEASSNTQGEVHFKDLTSGTYDLYETKAPKGYQQVTEKLATVTVDTTKPAEEMVTWGSPHSSVKVEANKEVTIVNHKETLTFSGKKIWENDRPDQRPAKIQVQLLQNGQKMPNQIQEVTKDNDWSYHFKDLPKYDAKNQEYKYSVEEVNVPDGYKVSYLGNDIFNTRETEFVFEQNNFNLEFGNAETKGQSGSKIIDEDTLTSFKGKKIWKNDTAENRPQAIQVQLYADGVAVEGQTKFISGSGNEWSFEFKNLKKYNGTGNDTIYSVKEVTVPTGYDVTYSANDIINTKREVITQQGPKLEIEETLPLESGASGGTTTVEDSRPVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDIDGKELAGATMELRDSSGKTISTWISDGQVKDFYLMPGKYTFVETAAPDGYEIATAITFTVNEQGQVTVNGKATKGDTHTVMVDAYKPTKGSGQVIDIEEKLPDEQGHSGSTTEIEDSKSSDLIIGGQGEVVDTTEDTQSGMTGHSGSTTEIEDSKSSDVIIGGQGQVVETTEDTQTGMYGDSGCKTEVEDTKLVQSFHFDNKEPESNSEIPKKDKPKSNTSLPATGEKQHNMFFWMVTSCSLISSVFVI SLKSKKRLSSC

As discussed above, applicants have also determined the nucleotide and encoded amino acid sequence of fimbrial structural subunits in several other GAS AI-4 strains of bacteria. Examples of sequences of these fimbrial structural subunits are set forth below.

M12 strain isolate 20010296 is a GAS AI-4 strain of bacteria.20010296_fimbrial is thought to be a fimbrial structural subunit of M12strain isolate 20010296. An example of a nucleotide sequence encodingthe 20010296_fimbrial protein (SEQ ID NO: 257) and a 20010296_fimbrialprotein amino acid sequence (SEQ ID NO: 258) are set forth below. SEQ IDNO: 257 agcagtggtcaattaacaataaaaaaatcaattacaaattttaatgatgatacacttttgatgcctaagacagactatacttttagcgttaatccggatagtgcggctacaggtactgaaagtaatttaccaattaaaccaggtattgctgttaacaatcaagatattaaggtttcttattctaatactgataagacatcaggtaaagaaaaacaagttgttgttgactttatgaaagttacttttcctagcgttggtatttaccgttatgttgttaccgagaataaagggacagcagaaggagttacatatgatgatacaaaatggttagttgacgtctatgttggtaataatgaaaagggaggtcttgaaccaaagtatattgtatctaaaaaaggagattctgctactaaagaaccaatccagtttaataattcattcgaaacaacgtcattaaaaattgaaaaggaagttactggtaatacaggagatcataaaaaagcatttaactttacattaacattgcaaccaaatgaatactatgaggcaagttcggttgtgaaaattgaagagaacggacaaacgaaagatgtgaaaattggggaggcatataagtttactttgaacgatagtcagagtgtgatattgtctaaattaccagttggtattaattataaagttgaagaagcagaagctaatcaaggtggatatactacaacagcaactttaaaagatggagaaaagttatctacttataacttaggtcaggaacataaaacagacaagactgctgatgaaat cgt SEQ ID NO: 258SSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDFMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKEVTGNTGDHKKAFNFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLS TYNLGQEHKTDKTADEIV

M12 strain isolate 20020069 is a GAS AI-4 strain of bacteria.20020069_fimbrial is thought to be a fimbrial structural subunit of M12strain isolate 20020069. An example of a nucleotide sequence encodingthe 20020069_fimbrial protein (SEQ ID NO: 259) and a 20020069_fimbrialprotein amino acid sequence (SEQ ID NO: 260) are set forth below. SEQ IDNO: 259 agcagtggtcaattaacaataaaaaaatcaattacaaattttaatgatgatacacttttgatgcctaagacagactatacttttagcgttaatccggatagtgcggctacaggtactgaaagtaatttaccaattaaaccaggtattgctgttaacaatcaagatattaaggtttcttattctaatactgataagacatcaggtaaagaaaaacaagttgttgttgactttatgaaagttacttttcctagcgttggtatttaccgttatgttgttaccgagaataaagggacagcagaaggagttacatatgatgatacaaaatggttagttgacgtctatgttggtaataatgaaaagggaggtcttgaaccaaagtatattgtatctaaaaaaggagattctgctactaaagaaccaatccagtttaataattcattcgaaacaacgtcattaaaaattgaaaaggaagttactggtaatacaggagatcataaaaaagcatttaactttacattaacattgcaaccaaatgaatactatgaggcaagttcggttgtgaaaattgaagagaacggacaaacgaaagatgtgaaaattggggaggcatataagtttactttgaacgatagtcagagtgtgatattgtctaaattaccagttggtattaattataaagttgaagaagcagaagctaatcaaggtggatatactacaacagcaactttaaaagatggagaaaagttatctacttataacttaggtcaggaacataaaacagacaagactgctgatgaaat cgt SEQ ID NO: 260SSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDEMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSEETTSLKIEKEVTGNTGDHKKAFNFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLS TYNLGQEHKTDKTADEIV

M12 strain isolate CDC SS 635 is a GAS AI-4 strain of bacteria. CDC SS635_fimbrial is thought to be a fimbrial structural subunit of M12strain isolate CDC SS 635. An example of a nucleotide sequence encodingthe CDC SS 635_fimbrial protein (SEQ ID NO: 261) and a CDC SS635_fimbrial protein amino acid sequence (SEQ ID NO: 262) are set forthbelow. SEQ ID NO: 261 gagacggcaggggttgttagcagtggtcaattaacaataaaaaaatcaattacaaattttaatgatgatacacttttgatgcctaagacagactatacttttagcgttaatccggatagtgcggctacaggtactgaaagtaatttaccaattaaaccaggtattgctgttaacaatcaagatattaaggtttcttattctaatactgataagacatcaggtaaagaaaaacaagttgttgttgactttatgaaagttacttttcctagcgttggtatttaccgttatgttgttaccgagaataaagggacagcagaaggagttacatatgatgatacaaaatggttagttgacgtctatgttggtaataatgaaaagggaggtcttgaaccaaagtatattgtatctaaaaaaggagattctgctactaaagaaccaatccagtttaataattcattcgaaacaacgtcattaaaaattgaaaaggaagttactggtaatacaggagatcataaaaaagcatttaactttacattaacattgcaaccaaatgaatactatgaggcaagttcggttgtgaaaattgaagagaacggacaaacgaaagatgtgaaaattggggaggcatataagtttactttgaacgatagtcagagtgtgatattgtctaaattaccagttggtattaattataaagttgaagaagcagaagctaatcaaggtggatatactacaacagcaactttaaaagatggagaaaagttatctacttataacttaggtcaggaacataaaacagacaagactgctgatgaaatcgttgtcacaaataaccgtgacact SEQ ID NO: 262ETAGVVSSGQLTTKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDPMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKEVTGNTGDHKKAFNFTLTLQPNEYYEASSVVKIEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLSTYNLGQEHKTDKTADEIVVTNNRDT

M5 strain isolate ISS 4883 is a GAS AI-4 strain of bacteria.ISS4883_fimbrial is thought to be a fimbrial structural subunit of M5strain isolate ISS 4883. An example of a nucleotide sequence encodingthe ISS4883_fimbrial protein (SEQ ID NO: 265) and an ISS4883_fimbrialprotein amino acid sequence (SEQ ID NO: 266) are set forth below. SEQ IDNO: 265 gagacggcaggggttgtaacaggaaaatcactacaagttacaaagacaatgacttatgatgatgaagaggtgttaatgcccgaaaccgcctttacttttactatagagcctgatatgactgcaagtggaaaagaaggcgacctagatattaaaaatggaattgtagaaggcttagacaaacaagtaacagtaaaatataagaatacagataaaacatctcaaaaaactaaaatagcacaatttgatttttctaaggttaaatttccagctataggtgtttaccgctatatggtttcagagaaaaacgataaaaaagacggaattaggtacgatgataaaaagtggactgtagatgtttatgttgggaataaggccaataacgaagaaggtttcgaagttctatatattgtatcaaaagaaggtacttctagtactaaaaaaccaattgaatttacaaactctattaaaactacttccttaaaaattgaaaaacaaataactggcaatgcaggagatcgtaaaaaatcattcaacttcacattaacattacaaccaagtgaatattataaaaccggatcagttgtgaaaatcgaacaggatggaagtaaaaaagatgtgacgataggaacgccttacaaatttactttgggacacggtaagagtgtcatgttatcgaaattaccaattggtatcaattactatcttagtgaagacgaagcgaataaagacggttacactacaacggcaacattaaaagaacaaggcaaagaaaagagttccgatttcactttgagtactcaaaaccagaaaacagacgaatctgctgacgaaatcgttgtcacaaataagc gtgacactctcgag SEQ IDNO: 266 ETAGVVTGKSLQVTKTMTYDDEEVLMPETAFTFTIEPDMTASGKEGDLDIKNGIVEGLDKQVTVKYKNTDKTSQKTKIAQFDFSKVKFPAIGVYRYMVSEKNDKKDGIRYDDKKWTVDVYVGNKANNEEGFEVLYIVSKEGTSSTKKPIEFTNSIKTTSLKIEKQITGNAGDRKKSFNFTLTLQPSEYYKTGSVVKIEQDGSKKDVTIGTPYKFTLGHGKSVMLSKLPIGINYYLSEDEANKDGYTTTATLKEQGKEKSSDFTLSTQNQKTDESADEIVVTNKRDTLE

M50 strain isolate ISS4538 is a GAS AI-4 strain of bacteria.ISS4538_fimbrial is thought to be a fimbrial structural subunit of M50strain ISS 4538. An example of a nucleotide sequence encoding theISS4538_fimbrial protein (SEQ ID NO: 255) and an ISS4538_fimbrialprotein amino acid sequence (SEQ ID NO: 256) are set forth below. SEQ IDNO: 255 atgaaaaaaaataaattattacttgctactgcaatcttagcaactgctttaggaacagcttctttaaatcaaaacgtaaaagctgagacggcaggggttgttagcagtggtcaattaacaataaaaaaatcaattacaaattttaatgatgatacacttttgatgcctaagacagactatacttttagcgttaatccggatagtgcggctacaggtactgaaagtaatttaccaattaaaccaggtattgctgttaacaatcaagatattaaggtttcttattctaatactgataagacatcaggtaaagaaaaacaagttgttgttgactttatgaaagttacttttcctagcgttggtatttaccgttatgttgttaccgagaataaagggacagcagaaggagttacatatgatgatacaaaatggttagttgacgtctatgttggtaataatgaaaagggaggtcttgaaccaaagtatattgtatctaaaaaaggagattctgctactaaagaaccaatccagtttaataattcattcgaaacaacgtcattaaaaattgaaaagaaagttactggtaatacaggagatcataaaaaagcatttaactttacattaacattgcaaccaaatgaatactatgaggcaagttcggttgtgaaaattgaagagaacggacaaacgaaagatgtgaaaattggggaggcatataagtttactttgaacgatagtcagagtgtgatattgtctaaattaccagttggtattaattataaagttgaagaagcagaagctaatcaaggtggatatactacaacagcaactttaaaagatggagaaaagttatctacttataacttaggtcaggaacataaaacagacaagactgctgatgaaatcgttgtcacaaataancgngacactcnagttccaacnggtgtngtaggcaccccncctccattcncagttcttancattgnggctantggtggngtnatntatnttacaaaacgnaaaaaagnataa SEQ ID NO: 256MKKNKLLLATAILATALGTASLNQNVKAETAGVVSSGQLTIKKSITNFNDDTLLMPKTDYTFSVNPDSAATGTESNLPIKPGIAVNNQDIKVSYSNTDKTSGKEKQVVVDFMKVTFPSVGIYRYVVTENKGTAEGVTYDDTKWLVDVYVGNNEKGGLEPKYIVSKKGDSATKEPIQFNNSFETTSLKIEKKVTGNTGDHKKAFNFTLTLQPNEYYEASSVVKTEENGQTKDVKIGEAYKFTLNDSQSVILSKLPVGINYKVEEAEANQGGYTTTATLKDGEKLSTYNLGQEHKTDKTADEIVVTNXRDTXVPTGVVGTPPPFXVLXIXAXGGVXYXTKRKKX

There may be an upper limit to the number of GAS proteins which will be in the compositions of the invention. Preferably, the number of GAS proteins in a composition of the invention is less than 20, less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than 12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, or less than 3. Still more preferably, the number of GAS proteins in a composition of the invention is less than 6, less than 5, or less than 4. Still more preferably, the number of GAS proteins in a composition of the invention is 3.

The GAS proteins and polynucleotides used in the invention are preferably isolated, i.e., separate and discrete, from the whole organism with which the molecule is found in nature or, when the polynucleotide or polypeptide is not found in nature, is sufficiently free of other biological macromolecules so that the polynucleotide or polypeptide can be used for its intended purpose.

Examples Other Gram Positive Bacterial Adhesin Island Sequences

The Gram positive bacteria AI polypeptides of the invention can, of course, be prepared by various means (e.g. recombinant expression, purification from a Gram positive bacterium, chemical synthesis etc.) and in various forms (e.g. native, fusions, glycosylated, non-glycosylated etc.). They are preferably prepared in substantially pure form (i.e. substantially free from other streptococcal or host cell proteins) or substantially isolated form.

The Gram positive bacteria AI proteins of the invention may include polypeptide sequences having sequence identity to the identified Gram positive bacteria proteins. The degree of sequence identity may vary depending on the amino acid sequence (a) in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more). Polypeptides having sequence identity include homologs, orthologs, allelic variants and mutants of the identified Gram positive bacteria proteins. Typically, 50% identity or more between two proteins is considered to be an indication of functional equivalence. Identity between proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension penalty=1.

The Gram positive bacteria adhesin island polynucleotide sequences may include polynucleotide sequences having sequence identity to the identified Gram positive bacteria adhesin island polynucleotide sequences. The degree of sequence identity may vary depending on the polynucleotide sequence in question, but is preferably greater than 50% (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more).

The Gram positive bacteria adhesin island polynucleotide sequences of the invention may include polynucleotide fragments of the identified adhesin island sequences. The length of the fragment may vary depending on the polynucleotide sequence of the specific adhesin island sequence, but the fragment is preferably at least 10 consecutive nucleotides (e.g. at least 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more).

The Gram positive bacteria adhesin island amino acid sequences of the invention may include polypeptide fragments of the identified Gram positive bacteria proteins. The length of the fragment may vary depending on the amino acid sequence of the specific Gram positive bacteria antigen, but the fragment is preferably at least 7 consecutive amino acids (e.g. 8, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). Preferably the fragment comprises one or more epitopes from the sequence. The fragment may comprise at least one T-cell or, preferably, a B-cell epitope of the sequence. T- and B-cell epitopes can be identified empirically (e.g., using PEPSCAN [Geysen et al. (1984) PNAS USA 81:3998-4002; Carter (1994) Methods Mol. Biol. 36:207-223] or similar methods), or they can be predicted (e.g., using the Jameson-Wolf antigenic index [Jameson, B A et al. 1988, CABIOS 4(1):181-186], matrix-based approaches [Raddrizzani and Hammer (2000) Brief Bioinform. 1(2):179-189], TEPITOPE [De Lalla et al. (1999) J. Immunol. 163:1725-1729], neural networks [Brusic et al. (1998) Bioinformatics 14(2):121-130], OptiMer & EpiMer [Meister et al. (1995) Vaccine 13(6):581-591; Roberts et al. (1996) AIDS Res. Hum. Retroviruses 12(7):593-610], ADEPT [Maksyutov & Zagrebelnaya (1993) Comput. Appl. Biosci. 9(3):291-297], Tsites [Feller & de la Cruz (1991) Nature 349(6311):720-721], hydrophilicity [Hopp (1993) Peptide Research 6:183-190], antigenic index [Welling et al. (1985) FEBS Lett. 188:215-218] or the methods disclosed in Davenport et al. (1995) Immunogenetics 42:392-397, etc.). Other preferred fragments include (1) the N-terminal signal peptides of each identified Gram positive bacteria protein, (2) the identified Gram positive bacteria protein without their N-terminal signal peptides, (3) each identified Gram positive bacteria protein wherein up to 10 amino acid residues (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) are deleted from the N-terminus and/or the C-terminus, e.g. the N-terminal amino acid residue may be deleted, and (4) the polypeptides, but without their N-terminal amino acid residue. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
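By way of illustration only, the fragment definitions above can be sketched in Python. The function names, the default limits and the toy peptide (the first 18 residues of SEQ ID NO: 256) are assumptions made for this sketch and are not part of the disclosure. The sketch only enumerates candidate fragments; it does not predict or identify epitopes.

```python
def consecutive_fragments(sequence, min_length=7):
    """Yield every contiguous fragment of at least min_length residues."""
    n = len(sequence)
    for start in range(n):
        for end in range(start + min_length, n + 1):
            yield sequence[start:end]


def terminal_deletions(sequence, max_deleted=10):
    """Yield variants with up to max_deleted residues removed from the
    N-terminus and/or the C-terminus (the full-length sequence is skipped)."""
    n = len(sequence)
    for n_del in range(max_deleted + 1):
        for c_del in range(max_deleted + 1):
            if n_del == 0 and c_del == 0:
                continue  # nothing deleted: not a deletion variant
            if n_del + c_del >= n:
                continue  # would delete the whole sequence
            yield sequence[n_del:n - c_del]


# Hypothetical toy input: the first 18 residues of SEQ ID NO: 256.
toy_peptide = "MKKNKLLLATAILATALG"
fragments = list(consecutive_fragments(toy_peptide, min_length=7))
variants = set(terminal_deletions(toy_peptide, max_deleted=10))
print(len(fragments), "fragments of 7 or more residues")
print(toy_peptide[1:] in variants)  # variant lacking only the N-terminal residue
```

Candidate fragments generated in this way would still need to be screened, empirically or with one of the prediction methods cited above, before any of them could be treated as epitope-containing.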

As indicated in the above text, nucleic acids and polypeptides of the invention may include sequences that:

-   (a) are identical (i.e., 100% identical) to the sequences disclosed in the sequence listing;
-   (b) share sequence identity with the sequences disclosed in the sequence listing;
-   (c) have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 single nucleotide or amino acid alterations (deletions, insertions, substitutions), which may be at separate locations or may be contiguous, as compared to the sequences of (a) or (b);
-   (d) when aligned with a particular sequence from the sequence listing using a pairwise alignment algorithm and scanned with a moving window of x monomers (amino acids or nucleotides) moving from start (N-terminus or 5′) to end (C-terminus or 3′), such that for an alignment that extends to p monomers (where p>x) there are p−x+1 such windows, have at least x·y identical aligned monomers in each window, where: x is selected from 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200; y is selected from 0.50, 0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99; and if x·y is not an integer then it is rounded up to the nearest integer. The preferred pairwise alignment algorithm is the Needleman-Wunsch global alignment algorithm [Needleman & Wunsch (1970) J. Mol. Biol. 48, 443-453], using default parameters (e.g., with Gap opening penalty=10.0, and with Gap extension penalty=0.5, using the EBLOSUM62 scoring matrix). This algorithm is conveniently implemented in the needle tool in the EMBOSS package [Rice et al. (2000) Trends Genet. 16:276-277].
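The moving-window test of item (d) can be sketched as follows, purely for illustration. The sketch assumes that the pairwise alignment has already been produced elsewhere (for example by the needle tool) and is supplied as two gapped strings of equal length, with "-" marking a gap; the function name and the toy sequences are hypothetical. The product x·y is rounded to nine decimal places before being rounded up, which is an implementation choice made here to avoid floating-point artefacts.

```python
import math


def passes_window_criterion(aligned_a, aligned_b, x, y):
    """Return True if every window of x aligned positions contains at least
    ceil(x*y) identical aligned monomers (gap columns never count)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    p = len(aligned_a)
    if p <= x:
        raise ValueError("the criterion is stated for alignments with p > x")
    # Round before ceil so that e.g. 18.999999999999996 is treated as 19.
    required = math.ceil(round(x * y, 9))
    identical = [int(a == b and a != "-") for a, b in zip(aligned_a, aligned_b)]
    window_sum = sum(identical[:x])          # first of the p - x + 1 windows
    if window_sum < required:
        return False
    for start in range(1, p - x + 1):        # slide the window one position
        window_sum += identical[start + x - 1] - identical[start - 1]
        if window_sum < required:
            return False
    return True


# Hypothetical toy alignment of p = 25 positions, tested with x = 20, y = 0.95.
reference = "A" * 25
one_change = "C" + "A" * 24
three_changes = "CCC" + "A" * 22
print(passes_window_criterion(reference, one_change, 20, 0.95))
print(passes_window_criterion(reference, three_changes, 20, 0.95))
```

With x = 20 and y = 0.95 each window must contain at least 19 identities, so a single substitution is tolerated in any window whereas three substitutions falling in the same window are not.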

The nucleic acids and polypeptides of the invention may additionally have further sequences to the N-terminus/5′ and/or C-terminus/3′ of these sequences (a) to (d).

All of the Gram positive bacterial sequences referenced herein are publicly available through PubMed on GenBank.

Streptococcus pneumoniae Adhesin Island Sequences

As discussed above, an S. pneumoniae AI sequence is present in the TIGR4 S. pneumoniae genome. Examples of S. pneumoniae AI sequences are set forth below.

SrtD (Sp0468) is a sortase. An example of an amino acid sequence of SrtDis set forth in SEQ ID NO: 80. SEQ ID NO: 80MSRTKLRALLGYLLMLVACLIPIYCFGQMVLQSLGQVKGHATFVKSMTTEMYQEQQNHSLAYNQRLASQNRTVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLGMGLAHVDGTPLPLDGTGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

SrtC (Sp0467) is a sortase. An example of an amino acid sequence of SrtCis set forth in SEQ ID NO: 81. SEQ ID NO: 81MSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAENATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEDILQKGAGLLEGASLPVGGENTHTVITAHRGLPTAELFSQLDKMKKGDIFYLHVLDQVLAYQVDQIVTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLGAMAVILLLLYRVYRNRRIVKGL EKQLEGRHVKD

SrtB (SP0466) is a sortase. An example of an amino acid sequence of SrtBis set forth in SEQ ID NO: 82. SEQ ID NO: 82MAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVCLIVTLLWITRRLRKKKKQPEKALKALKAARKEVKVEDGQQ

Sp0465 is a hypothetical protein. An example of an amino acid sequence of Sp0465 is set forth in SEQ ID NO: 83. SEQ ID NO: 83 MFLPFLSASLYLQTHHFIAFPNRQSYLLRETRKSHFFLIHHPF

RrgC (SP0464) is a cell wall surface anchor family protein. RrgCcontains a sortase substrate motif VPXTG (SEQ ID NO: 137), shown initalics in SEQ ID NO: 84. SEQ ID NO: 84MISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDVSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKPNN

RrgB (Sp0463) is a cell wall surface anchor protein. RrgB contains asortase substrate motif IPXTG (SEQ ID NO: 133), shown in italics in SEQID NO: 85. SEQ ID NO: 85MKSINKFLTMLAALLLTASSLFSAATVFAAGTTTTSVTVHKLLATDGDMDKIANELETGNYAGNKVGVLPANAKEIAGVMFVWTNTNNEIIDENGQTLGVNIDPQTFKLSGAMPATAMKKLTEAEGAKFNTANLPAAKYKIYEIHSLSTYVGEDGATLTGSKAVPIEIELPLNDVVDAHVYPKNTEAKPKIDKDFKGKANPDTPRVDKDTPVNHQVGDVVEYEIVTKIPALANYATANWSDRMTEGLAFNKGTVKVTVDDVALEAGDYALTEVATGFDLKLTDAGLAKVNDQNAEKTVKITYSATLNDKAIVEVPESNDVTFNYGNNPDHGNTPKPNKPNENGDLTLTKTWVDATGAPIPAGAEATFDLVNAQTGKVVQTVTLTTDKNTVTVNGLDKNTEYKFVERSIKGYSADYQEITTAGEIAVKNWKDENPKPLDPTEPKVVTYGKKFVKVNDKDNRLAGAEFVIANADNAGQYLARKADKVSQEEKQLVVTTKDALDRAVAAYNALTAQQQTQQEKEKVDKAQAAYNAAVIAANNAFEWVADKDNENVVKLVSDAQGRFEITGLLAGTYYLEETKQPAGYALLTSRQKFEVTATSYSATGQGIEYTAGSGKDDATKVVNKKITIPQTGGIGTIIFAVAGAAIMGIA VYAYVKNNKDEDQLA

RrgA (Sp0462) is a cell wall surface anchor protein. RrgA contains asortase substrate motif YPXTG (SEQ ID NO: 186), indicated in italics inSEQ ID NO: 86. SEQ ID NO: 86MLNRETHMKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTVYEQKDKSVPLDVVILLDNSNSMSNIRNKNARRAERAGEATRSLIDKITSDSENRVALVTYASTIFDGTEFTVEKGVADKNGKRLNDSLFWNYDQTSFTTNTKDYSYLKLTNDKNDIVELKNKVPTEAEDHDGNRLMYQFGATFTQKALMKADEILTQQARQNSQKVIFHITDGVPTMSYPINFNHATFAPSYQNQLNAFFSKSPNKDGILLSDFITQATSGEHTIVRGDGQSYQMFTDKTVYEKGAPAAFPVKPEKYSEMKAAGYAVIGDPINGGYIWLNWRESILAYPFNSNTAKITNHGDPTRWYYNGNIAPDGYDVFTVGIGINGDPGTDEATATSFMQSISSKPENYTNVTDTTKILEQLNRYFHTIVTEKKSIENGTITDPMGELIDLQLGTDGRFDPADYTLTANDGSRLENGQAVGGPQNDGGLLKNAKVLYDTTEKRIRVTGLYLGTDEKVTLTYNVRLNDEFVSNKFYDTNGRTTLHPKEVEQNTVRDFPIPKIRDVRKYPEITISKEKKLGDIEFIDVNDNDKKPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKHP
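For illustration, the sortase substrate motifs named above (VPXTG, IPXTG and YPXTG, where X is any amino acid) can be located with a simple pattern search. This is a minimal sketch, not the method used to annotate the sequences: the function name is hypothetical, and the three test strings are short excerpts surrounding the motif copied from SEQ ID NOs: 84, 85 and 86 rather than the full-length proteins.

```python
import re


def find_sortase_motifs(sequence, motif):
    """Return (1-based position, matched text) for each occurrence of a
    motif such as 'VPXTG', where X stands for any single residue."""
    # Turn each X into a single-residue wildcard; a lookahead is used so
    # that overlapping occurrences would also be reported.
    pattern = re.compile("(?=(" + motif.replace("X", "[A-Z]") + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Short excerpts around the motif, taken from the sequences listed above.
excerpts = {
    "RrgC (SEQ ID NO: 84)": ("RIDVPDTGEETLY", "VPXTG"),
    "RrgB (SEQ ID NO: 85)": ("KKITIPQTGGIGT", "IPXTG"),
    "RrgA (SEQ ID NO: 86)": ("KREYPRTGGIGML", "YPXTG"),
}

for name, (fragment, motif) in excerpts.items():
    print(name, motif, find_sortase_motifs(fragment, motif))
```

Positions reported for an excerpt are relative to that excerpt; running the same function on a full-length sequence would give positions in the mature numbering of that sequence.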

RlrA (Sp0461) is a transcriptional regulator. An example of an aminoacid sequence for RlrA is set forth in SEQ ID NO: 87. SEQ ID NO: 87MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGTEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKEKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLESLYLTETIFSSLPAIPIKIILNNQADVNLTKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

As discussed above, an S. pneumoniae AI sequence is present in the S. pneumoniae strain 670 genome. Examples of S. pneumoniae AI sequences are set forth below.

Orf1_(—)670 is a transposase. An example of an amino acid sequence oforf1_(—)670 is set forth in SEQ ID NO: 171. SEQ ID NO: 171MEHINHTTLLIGIKDKNITLNKAIQHDTHIEVFATLDYHPPKCKHCKGKQIKYDFQKPSKIPFIEIGGFPSLIHLKKRRFQCKSCRKVTVAETTLVQKNCQISEMVRQKIAQLLLNREALTHIASKLAISTSTSTVYRKLKQFHFQEDYTTLPEILSWDEFSYQKGKLAFIAQDFNTKKIMTILDNRRQTTIRNHFFKYSKEARKKVKVVTVDMSGSYIPLIKKLFPNAKIVLDRFHIVQHMSRALNQTRINIMKQFDDKSLEYRALKYYWKFILKDSRKLSLKPFYARTFRETLTPRECLKKTFTLVPELKDYYDLYQLLLFHLQEKNTDQFWGLIQDTLPHLNRTFKTTLSTFICYKNYITNAIELPYSNAKLEATNKLIKDIKRNAFGFRNFENFKK RIFIALNTKKERTKFVLSRA

Orf2_(—)670 is a transcriptional regulator. An example of an amino acidsequence of Orf2_(—)670 is set forth in SEQ ID NO: 172. SEQ ID NO: 172MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNETDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

Orf3_(—)670 is a cell wall surface anchor family proten. An example ofan amino acid sequence of Orf3_(—)670 is set forth in SEQ ID NO: 173.SEQ ID NO: 173 MLNRETHMKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVILLDNSNSMSNIRHNHAHRAEKAGEATPALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKILNDSALWTEDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTADDILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEATATRFMQSISSSPDNYTNVADPSQILQELNRYFYTIVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITIPKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKHP

Orf4_(—)670 is a cell wall surface anchor family protein. An example ofan amino acid sequence of orf4_(—)670 is set forth in SEQ ID NO: 174.SEQ ID NO: 174 MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRIREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVIDAHVFPKNSYNKPVVDKRIADTLNYNDQNGLSIGTKIPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGNNGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDITYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINPEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKFGYVEVAGKDEAMVLTSNTDGQFQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAAIMGIAVYAYVKNNKDEDQLA

Orf5_(—)670 is a cell wall surface anchor family protein. An example ofan amino acid sequence of orf5_(—)670 is set forth in SEQ ID NO: 175.SEQ ID NO: 175 MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKSLLFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTTTVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKPN N

Orf6_(—)670 is a sortase. An example of an amino acid sequence oforf6_(—)670 is set forth in SEQ ID NO: 176. SEQ ID NO: 176MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

Orf7_(—)670 is a sortase. An example of an amino acid sequence oforf7_(—)670 is set forth in SEQ ID NO: 177. SEQ ID NO: 177VSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDEEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPTAERNRAVRERGQFWLWLLLAALVMILVLSYGVYRHRRIVKGL EKQLEEHHVKG

Orf8_(—)670 is a sortase. An example of an amino acid sequence oforf8_(—)670 is set forth in SEQ ID NO: 178. SEQ ID NO: 178MSKAKLQKLLGYLLMLVALVIPVYCEGQMVLQSLGQVKGHEIFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, an S. pneumoniae AI sequence is present in the 19A Hungary 6 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 19A Hungary 6 are set forth below.

ORF2_(—)19AH is a transcriptional regulator. An example of an amino acidsequence of ORF2_(—)19AH is set forth in SEQ ID NO: 187. SEQ ID NO: 187MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVTDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSTILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)19AH is a cell wall surface protein. An example of an amino acidsequence of ORF3_(—)19AH is set forth in SEQ ID NO: 188. SEQ ID NO: 188MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVILLDNSNSMSNTRHNHAHRAEKAGEATRALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKILNDSALWTFDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTADDILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEATATRFMQSISSSPDNYTNVADPSQILQELNRYFYTTVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITIPKEKKLGETEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKNP

ORF4_(—)19AH is a cell wall surface protein. An example of an amino acidsequence of ORF4_(—)19AH is set forth in SEQ ID NO: 189. SEQ ID NO: 189MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRIREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVTDAHVFPKNSYNKPVVDKRIADTLNYNDQNGLSIGTKTPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGXNGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDITYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINPEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKEGYVEVAGKDEAMVLTSNTDGQFQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAAIMGIAVYAYVKNNKDEDQLA

ORF5_(—)19AH is a cell wall surface protein. An example of an amino acidsequence of ORF5_(—)19AH is set forth in SEQ ID NO: 190. SEQ ID NO: 190MTMQKMQKMISRIFFVMALCPSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKLSSEKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEELFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTITVVNQKLPRGNVDEMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSETIGKDTRKELVTVVKNNKRPRTDVPDTGEETLYILMLVAILLFGSGYYLTKKPN N

ORF6_(—)19AH is a putative sortase. An example of an amino acid sequenceof ORF6_(—)19AH is set forth in SEQ ID NO: 191. SEQ ID NO: 191MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQTADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

ORF7_(—)19AH is a putative sortase. An example of an amino acid sequenceof ORF7_(—)19AH is set forth in SEQ ID NO: 192. SEQ ID NO: 192MDNSRRSRKKGTKKKKHPLILLLIFLVGFAVAIYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLAALVMILVLSYGVYRHRRIVKGLEKQLEEHHVKG

ORF8_(—)19AH is a putative sortase. An example of an amino acid sequenceof ORF8_(—)19AH is set forth in SEQ ID NO: 193. SEQ ID NO: 193MSKAKLQKLLGYLLMLVALVIPVYCFGQMVLQSLGQVKGHEIFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGIRSVIAGHPAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFMGILFVLWKLARLLRGK

As discussed above, an S. pneumoniae AI sequence is present in the 6B Finland 12 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 6B Finland 12 are set forth below.

ORF2_(—)6BF is a transcriptional regulator. An example of an amino acidsequence of ORF2_(—)6BF is set forth in SEQ ID NO: 194. SEQ ID NO: 194MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSENEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHGQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)6BF is a cell wall surface protein. An example of an amino acidsequence of ORF3_(—)6BF is set forth in SEQ ID NO: 195. SEQ ID NO: 195MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRTYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVILLDNSNSMSNIRHNHAHRAEKAGEATRALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKILNDSALWTFDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTADDILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEATATRFMQSISSSPDNYTNVADPSQILQELNRYFYTIVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKEYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITTPKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPEYLIGCMMMGGVLLYTRKHP

ORF4_(—)6BF is a cell wall surface protein. An example of an amino acidsequence of ORF4_(—)6BF is set forth in SEQ ID NO: 196. SEQ ID NO: 196MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRIREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVIDAHVFPKNSYNKPVVDKRIADTLNYNDQNGLSIGTKIPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGNNGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDITYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINPEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKFGYVEVAGKDEAMVLTSNTDGQFQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIEAVAGAAIMGIAVYAYVKNNKDEDQLA

ORF5_(—)6BF is a cell wall surface protein. An example of an amino acidsequence of ORF5_(—)6BF is set forth in SEQ ID NO: 197. SEQ ID NO: 197MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKPN N

ORF6_(—)6BF is a putative sortase. An example of an amino acid sequenceof ORF6_(—)6BF is set forth in SEQ ID NO: 198. SEQ ID NO: 198MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPTGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

ORF7_(—)6BF is a putative sortase. An example of an amino acid sequenceof ORF7_(—)6BF is set forth in SEQ ID NO: 199. SEQ ID NO: 199MDNSRRSRKKGTKKKKHPLILLLIFLVGFAVAIYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLAALVMILVLSYGVYRHRRIVKGLEKQLEEHHVKG

ORF8_(—)6BF is a putative sortase. An example of an amino acid sequenceof ORF8_(—)6BF is set forth in SEQ ID NO: 200. SEQ ID NO: 200MSKAKLQKLLGYLLMLVALVTPVYCFGQMVLQSLGQVKGHEIFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLETMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAPTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, an S. pneumoniae AI sequence is present in the 6B Spain 2 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 6B Spain 2 are set forth below.

ORF2_(—)6BSP is a transcriptional regulator. An example of an amino acidsequence of ORF2_(—)6BSP is set forth in SEQ ID NO: 201. SEQ ID NO: 201MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSTLQELQETFEEELTFNLDTQQVQLTEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSPLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFITLNNQADVNLTKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDTRKEAFDKRVAM IAKKAHYLL

ORF3_(—)6BSP is a cell wall surface protein. An example of an amino acidsequence of ORF3_(—)6BSP is set forth in SEQ ID NO: 202. SEQ ID NO: 202MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVILLDNSNSMSNIRHNHAHRAEKAGEATRALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKILNDSALWTFDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTAKKILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEATATRFMQSISSSPDNYTNVADPSQILQELNRYEYTIVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRTRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITIPKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKHP

ORF4_(—)6BSP is a cell wall surface protein. An example of an amino acidsequence of ORF4_(—)6BSP is set forth in SEQ ID NO: 203. SEQ ID NO: 203MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRTREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVTDAHVFPKNSYNKPVVDKRIADTLNYNDQNGLSIGTKIPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGNNGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDTTYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINPEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKPGYVEVAGKDEAMVLTSNTDGQEQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAAIMGIAVYAYVKNNKDEDQLA

ORF5_(—)6BSP is a cell wall surface protein. An example of an amino acidsequence of ORF5_(—)6BSP is set forth in SEQ ID NO: 204. SEQ ID NO: 204MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKPN N

ORF6_(—)6BSP is a putative sortase. An example of an amino acid sequenceof ORF6_(—)6BSP is set forth in SEQ ID NO: 205. SEQ ID NO: 205MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVETPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFTAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

ORF7_(—)6BSP is a putative sortase. An example of an amino acid sequenceof ORF7_(—)6BSP is set forth in SEQ ID NO: 206. SEQ ID NO: 206MDNSRRSRKKGTKKKKHPLILLLIFLVGFAVAIYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLAALVMILVLSYGVYRHRRIVKGLEKQLEEHHVKG

ORF8_(—)6BSP is a putative sortase. An example of an amino acid sequenceof ORF8_(—)6BSP is set forth in SEQ ID NO: 207. SEQ ID NO: 207MSKAKLQKLLGYLLMLVALVIPVYCFGQMVLQSLGQVKGHEIFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLETMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPTPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, an S. pneumoniae AI sequence is present in the 9V Spain 3 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 9V Spain 3 are set forth below.

ORF2_(—)9VSP is a transcriptional regulator. An example of an amino acidsequence of ORF2_(—)9VSP is set forth in SEQ ID NO: 208. SEQ ID NO: 208MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILREELLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMTVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILILSPPPSEEHLTEPLIITTKEYLPYVKKQYPKGKHHFLTTALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)9VSP is a cell wall surface protein. An example of an amino acidsequence of ORF3_(—)9VSP is set forth in SEQ ID NO: 209. SEQ ID NO: 209MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTNGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQRTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTVYERKDKSVPLDVVILLDNSNSMSNIRNKNARRAERAGEATRSLIDKITSDPENRVALVTYASTIFDGTEFTVEKGVADKNGKRLNDSLFWNYDQTSFTTNTKDYSYLKLTNDKNDIVELKNKVPTEAEDHDGNRLMYQFGATFTQKALMKADEILTQQARQNSQKVIFHITDGVPTMSYPINFNHATFAPSYQNQLNAFFSKSPNKDGILLSDFITQATSGEHTIVRGDGQSYQMFTDKTVYEKGAPAAFPVKPEKYSEMKAVGYAVIGDPINGGYIWLNWRESILAYPFNSNTAKITNHGDPTRWYYNGNIAPDGYDVFTVGIGINGDPGTDEATATSFMQSISSKPENYTNVTDTTKILEQLNRYFHTIVTEKKSIENGTITDPMGELIDLQLGTDGRFDPADYTLTANDGSRLENGQAVGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPAITIAKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVISTVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLLFYLIGCMMMGGVLLYTRKHP

ORF4_(—)9VSP is a cell wall surface protein. An example of an amino acidsequence of ORF4_(—)9VSP is set forth in SEQ ID NO: 210. SEQ ID NO: 210MKSINKFLTMLAALLLTASSLFSAATVFAAGTTTTSVTVHKLLATDGDMDKIANELETGNYAGNKVGVLPANAKEIAGVMFVWTNTNNEIIDENGQTLGVNIDPQTFKLSGAMPATAMKKLTEAEGAKFNTANLPAAKYKIYEIHSLSTYVGEDGATLTGSKAVPIEIELPLNDVVDAHVYPKNTEAKPKIDKDFKGKANPDTPRVDKDTPVNHQVGDVVEYEIVTKIPALANYATANWSDRMTEGLAFNKGTVKVTVDDVALEAGDYALTEVATGFDLKLTDAGLAKVNDQNAEKTVKITYSATLNDKAIVEVPESNDVTFNYGNNPDHGNTPKPNKPNENGDLTLTKTWVDATGAPIPAGAEATFDLVNAQTGDVVQTVTLTTDKNTVTVNGLDKNTEYKFVERSIKGYSADYQEITTAGEIAVKNWKDENPKPLDPTEPKVVTYGKKFVKVNDKDNRLAGAEFVIANADNAGQYLARKADKVSQEEKQLVVTTKDALDRAVAAYNALTAQQQTQQEKEKVDKAQAAYNAAVTAANNAFEWVADKDNENVVKLVSDAQGRFEITGLLAGTYYLEETKQPAGYALLTSRQKFEVTATSYSATGQGIEYTAGSGKDDATKVVNKKITIPQTGGIGTIIFAVAGAVIMGIA VYAYVKNNKDEDQLA

ORF5_(—)9VSP is a cell wall surface protein. An example of an amino acidsequence of ORF5_(—)9VSP is set forth in SEQ ID NO: 211. SEQ ID NO: 211MTMQKMQKMQKMQKMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDNRVQTVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKADTVTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIVVTNLPLGTYRFKEVEPLAGYTVTTMDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEENGHYTPVLQNGKEVVVASGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILL FGSGYYLTKKTNN

ORF6_(—)9VSP is a putative sortase. An example of an amino acid sequenceof ORF6_(—)9VSP is set forth in SEQ ID NO: 212. SEQ ID NO: 212MLIKMAKTKKQKRNNLLLGVVFFIGIAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPAIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKRQSERALKALKEATKEVKVE DE

ORF7_(—)9VSP is a putative sortase. An example of an amino acid sequenceof ORF7_(—)9VSP is set forth in SEQ ID NO: 213. SEQ ID NO: 213MSKSRYSRKKSVKKKKNPFILLLIFLVGLAVANYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERTGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDIEYLHVLDQVLAYQVDQIVTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLGAMAVILLLLYRVYRNRRIVKGLEKQLEGRHVKD

ORF8_(—)9VSP is a putative sortase. An example of an amino acid sequenceof ORF8_(—)9VSP is set forth in SEQ ID NO: 214. SEQ ID NO: 214MSRTKLRALLGYLLMLVACLIPIYGFGQMVLQSLGQVKGHATFVKSMTTEMYQEQQNHSLAYNQRLASQNRIVDPELAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLGMGLAHVDGTPLPLDGTGIRSVIAGHPAEPSHVFTRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNEERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, a S. pneumoniae AI sequence is present in the 14 CSR 10 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 14 CSR 10 are set forth below.

ORF2_(—)14CSR is a transcriptional regulator. An example of an aminoacid sequence of ORF2_(—)14CSR is set forth in SEQ ID NO: 215. SEQ IDNO: 215 MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKTLRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTTQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHTSKATVQEWMTEQKIEGVTDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)14CSR is a cell wall surface protein. An example of an aminoacid sequence of ORF3_(—)14CSR is set forth in SEQ ID NO: 216. SEQ IDNO: 216 MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVTLLDNSNSMSNIRHNHAHRAEKAGEATRALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKILNDSALWTFDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTADDILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEARARTFMQSISSSPDNYTNVADPSQILQELNRYFYTIVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITIPKEKKLGEIEFIKINKNDKKPLRDAVPSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTEKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPTPPKREYPRTGGIGMLPFYLTGCMMMGGVLLYTRKHP

ORF4_(—)14CSR is a cell wall surface protein. An example of an aminoacid sequence of ORF4_(—)14CSR is set forth in SEQ ID NO: 217. SEQ IDNO: 217 MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRIREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVIDAHVFPKNSYNKPVVDKRTADTLNYNDQNGLSIGTKIPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGNNGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDITYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINPEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKFGYVEVAGKDEAMVLTSNTDGQFQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAAIMGIAVYAYVKNNKDEDQLA

ORF5_(—)14CSR is a cell wall surface protein. An example of an aminoacid sequence of ORF5_(—)14CSR is set forth in SEQ ID NO: 218. SEQ IDNO: 218 MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDDRVQIVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKTDTMTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGETFVTNLPLGNYRFKEVEPLAGYAVTTLDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEESGHYTPVLQNGKEVVVTSGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKPN N

ORF6_(—)14CSR is a putative sortase. An example of an amino acidsequence of ORF6_(—)14CSR is set forth in SEQ ID NO: 219. SEQ ID NO: 219MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

ORF7_(—)14CSR is a putative sortase. An example of an amino acidsequence of ORF7_(—)14CSR is set forth in SEQ ID NO: 220. SEQ ID NO: 220MDNSRRSRKKGTKKKKHPLILLLIFLVGFAVAIYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQEWLWLLLAALVMILVLSYGVYRHRRIVKGLEKQLEEHHVKG

ORF8_(—)14CSR is a putative sortase. An example of an amino acidsequence of ORF8_(—)14CSR is set forth in SEQ ID NO: 221. SEQ ID NO: 221MSKAKLQKLLGYLLMLVALVIPVYCFGQMVLQSLGQVKGHEIFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGIRSVIAGHPAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTPNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGTLFVLWKLARLLRGK

As discussed above, a S. pneumoniae AI sequence is present in the 19F Taiwan 14 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 19F Taiwan 14 are set forth below.

ORF2_(—)19FTW is a transcriptional regulator. An example of an aminoacid sequence of ORF2_(—)19FTW is set forth in SEQ ID NO: 222. SEQ IDNO: 222 MLNKYIEKRITDKITILNILLDIRSTELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKTLRFFLLQGNQSENEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLTALLQFHFGTEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHESTLVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILTSPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)19FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF3_(—)19FTW is set forth in SEQ ID NO: 223. SEQ IDNO: 223 MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIESNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVTPEGTLSKRIYQVNNLDDNQYGIELTVSGKTVYERKDKSVPLDVVILLDNSNSMSNIRNKNARRAERAGEATRSLIDKITSDPENRVALVTYASTIPDGTEFTVEKGVADKNGKRLNDSLFWNYDQTSFTTNTKDYSYLKLTNDKNDIVELKNKVPTEAEDHDGNRLMYQFGATFTQKALMKADEILTQQARQNSQKVIFHITDGVPTMSYPINFNHATFAPSYQNQLNAFFSKSPNKDGILLSDFITQATSGEHTIVRGDGQSYQMFTDKTVYEKGAPAAFPVKPEKYSEMKAVGYAVIGDPINGGYIWLNWRESILAYPFNSNTAKITNHGAPTRWYYNGNIAPDGYDVFTVGIGINGDPGTDEATATSFMQSISSKPENYTNVTDTTKILEQLNRYFHTIVTEKKSIENGTITDPMGELIDLQLGTDGRFDPADYTLTANDGSRLENGQAVGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQEVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPAITIAKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKHP

ORF4_(—)19FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF4_(—)19FTW is set forth in SEQ ID NO: 224. SEQ IDNO: 224 MKSINKFLTMLAALLLTASSLFSAATVFAAGTTTTSVTVHKLLATDGDMDKIANELETGNYAGNKVGVLPANAKEIAGVMFVWTNTNNEIIDENGQTLGVNIDPQTFKLSGAMPATAMKKLTEAEGAKFNTANLPAAKYKIYEIHSLSTYVGEDGATLTGSKAVPIEIELPLNDVVDAHVYPKNTEAKPKIDKDFKGKANPDTPRVDKDTPVNHQVGDVVEYEIVTKIPALANYATANWSDRMTEGLAFNKGTVKVTVDDVALEAGDYALTEVATGFDLKLTDAGLAKVNDQNAEKTVKITYSATLNDKAIVEVPESNDVTFNYGNNPDHGNTPKPNKPNENGDLTLTKTWVDATGAPIPAGAEATFDLVNAQTGKVVQTVTLTTDKNTVTVNGLDKNTEYKFVERSIKGYSADYQEITTAGEIAVKNWKDENPKPLDPTEPKVVTYGKKFVKVNDKDNRLAGAEFVTANADNAGQYLARKADKVSQEEKQLVVTTKDALDRAVAAYNALTAQQQTQQEKEKVDKAQAAYNAAVIAANNAFEWVADKDNENVVKLVSDAQGRFEITGLLAGTYYLEETKQPAGYALLTSRQKFEVTATSYSATGQGIEYTAGSGKDDATKVVNKKITIPQTGGIGTIIFAVAGAVIMGIA VYAYVKNNKDEDQLA

ORF5_(—)19FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF5_(—)19FTW is set forth in SEQ ID NO: 225. SEQ IDNO: 225 MTMQKMQKMTSRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDNRVQIVRDLHSWDENKLSSFKKTSFEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKADTVTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIVVTNLPLGTYRFKEVEPLAGYTVTTMDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEENGHYTPVLQNGKEVVVASGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKTN N

ORF6_(—)19FTW is a putative sortase. An example of an amino acidsequence of ORF6_(—)19FTW is set forth in SEQ ID NO: 226. SEQ ID NO: 226MLIKMAKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADTDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPAIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKRQSERALKALKEATKEVKVE DE

ORF7_(—)19FTW is a putative sortase. An example of an amino acidsequence of ORF7_(—)19FTW is set forth in SEQ ID NO: 227. SEQ ID NO: 227MSKSRYSRKKSVKKKKNPFILLLIFLVGLAVAMYPLVSRYYYRIESNEVTKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTDQEKKQGVSEYANMLKVHERIGYVEIPAIEQEIPMYVGTSEDILQKGAGLLEGASLPVGGENTHTVITAHRGLPTAELFSQLDKMKKGDIFYLHVLDQVLAYQVDQIVTVEPNDFEPVLIQHGQDYATLLTCTPYMLNSHRLLVRGKRTPYTAPIAERNRAVRERGQFWLWLLLGAMAVILLLLYRVYRNRRTVKGLEKQLEGRHVKD

ORF8_(—)19FTW is a putative sortase. An example of an amino acidsequence of ORF8_(—)19FTW is set forth in SEQ ID NO: 228. SEQ ID NO: 228MSRTKLRALLGYLLMLVACLIPIYCFGQMVLQSLGQVKGHATFVKSMTTEMYQEQQNHSLAYNQRLASQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLGMGLAHVDGTPLPLDGTGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, a S. pneumoniae AI sequence is present in the 23F Taiwan 15 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 23F Taiwan 15 are set forth below.

ORF2_(—)23FTW is a transcriptional regulator. An example of an aminoacid sequence of ORF2_(—)23FTW is set forth in SEQ ID NO: 229. SEQ IDNO: 229 MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTPNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLETTPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKAIVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLITITTKEYLPYVKKQYPKGKHHFLTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAM IAKKAHYLL

ORF3_(—)23FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF3_(—)23FTW is set forth in SEQ ID NO: 230. SEQ IDNO: 230 MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTVYEQKDKSVPLDVVILLDNSNSMSNIRNKNARRAERAGEATRSLIDKITSDPENRVALVTYASTIFDGTEFTVEKGVADKNGKRLNDSLFWNYDQTSFTTNTKDYSYLKLTNDKNDIVELKNKVPTFAEDHDGNRLMYQFGATFTQKALMKADEILTQQARQNSQKVIFHITDGVPTMSYPINFNHATFAPSYQNQLNAETSKSPNKDGILLSDFITQATSGEHTIVRGDGQSYQMFTDKTVYEKGAPAAFPVKPEKYSEMKAAGYAVIGDPINGGYIWLNWRESILAYPFNSNTAKITNHGDPTRWYYNGNIAPDGYDVFTVGIGINGDPGTDEATATSFMQSISSKPENYTNVTDTTKILEQLNRYFHTIVTEKKSIENGTITDPMGELIDLWLGTDGRFDPADYTLTANDGSRLENGQAVGGPQNDGGLLKNAKVLYDTTEKRIRVTGLYLGTKEKVTLTYNVRLNDEFVSNKFYDTNGRTTLHPKEVEQNTVRDFPIPKIRDVRKYPEITISKEKKLGDIEFIKVNKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEFTNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKHP

ORF4_(—)23FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF4_(—)23FTW is set forth in SEQ ID NO: 231. SEQ IDNO: 231 MKSINKFLTILAALLLTVSSLFSAATVFAAEQKTKTLTVHKLLMTDQELDAWNSDAITTAGYDGSQNFEQFKQLQGVPQGVTEISGVAFELQSYTGPQGKEQENLTNDAVWTAVNKGVTTETGVKFDTEVLQGTYRLVEVRKESTYVGPNGKVLTGMKAVPALITLPLVNQNGVVENAHVYPKNSEDKPTATKTFDTAAGFVDPGEKGLAIGTKVPYIVTTTIPKNSTLATAPWSDEMTEGLDYNGDVVVNYNGQPLDNSHYTLEAGHNGFILKLNEKGLEAINGKDAEATITLKYTATLNALAVADVPEANDVTFHYGNNPGHGNTPKPNKPKNGELTITKTWADAKDAPIAGVEVTFDLVNAQTGEVVKVPGHETGIVLNQTNNWTFTATGLDNNTEYKFVERTIKGYSADYQTITETGKIAVKNWKDENPEPINPEEPRVKTYGKKFVKVDQKDERLKEAQFVVKNEQGKYLALKSAAQQAVNEKAAAEAKQALDAAIAAYTNAADKNAAQAVVDAAQKTYNDNYRAARFGYVEVERKEDALVLTSNTDGQFQISGLAAGSYTLEETKAPEGFAKLGDVKFEVGAGSWNQGDFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAVIMGIAVYAYVKNNKDE DQLA

ORF5_(—)23FTW is a cell wall surface protein. An example of an aminoacid sequence of ORF5_(—)23FTW is set forth in SEQ ID NO: 232. SEQ IDNO: 232 MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDNRVQIVRDLHSWDENKLSSFKKTSEEMTFLENQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKADTVTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIVVTNLPLGTYRFKEVEPLAGYTVTTMDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEENGHYTPVLQNGKEVVVASGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRIDVPDTGEETLYILMLVAILLFGSGYYLTKKTN N

ORF6_(—)23FTW is a putative sortase. An example of an amino acidsequence of ORF6_(—)23FTW is set forth in SEQ ID NO: 233. SEQ ID NO: 233MLIKMVKTKKQKRNNLLLGVVFFIGMAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVEIPVIDVDLPVYAGTAEEVLQQGAGQLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKKQPEKALKALKAARKEVKVE DGQQ

ORF7_(—)23FTW is a putative sortase. An example of an amino acidsequence of ORF7_(—)23FTW is set forth in SEQ ID NO: 234. SEQ ID NO: 234MDNSRRSRKKGTKKKKHPLILLLIFLVGFAVAIYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFThQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEEILQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDVFYLHVLDQVLAYQVDQILTVEPNDFEPVLIQHGKDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLAALVMILVLSYGVYRHRRIVKGLEKQLEEHHVKG

ORF8_(—)23FTW is a putative sortase. An example of an amino acidsequence of ORF8_(—)23FTW is set forth in SEQ ID NO: 235. SEQ ID NO: 235MSKAKLQKLLGYLLMLVALVIPVYCFGQMVLQSLGQVKGHETFSESVTADSYQEQLQRSLDYNQRLDSQNRIVDPFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLAMGLAHVDGTPLPVEGKGTRSVIAGHPAEPSHVFFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

As discussed above, a S. pneumoniae AI sequence is present in the 23F Poland 16 S. pneumoniae genome. Examples of S. pneumoniae AI sequences from 23F Poland 16 are set forth below.

ORF2_(—)23FP is a transcriptional regulator. An example of an amino acid sequence of ORF2_(—)23FP is set forth in SEQ ID NO: 236. SEQ ID NO: 236 MLNKYIEKRITDKITILNILLDIRSIELDELSTLTSLQSKSLLSILQELQETFEEELTFNLDTQQVQLIEHHSHQTNYYFHQLYNQSTILKILRFFLLQGNQSFNEFTQKEYISIATGYRVRQKCGLLLRSVGLDLVKNQVVGPEYRIRFLIALLQFHFGIEIYDLNDGSMDWVTHMIVQSNSQLSHELLEITPDEYVHFSILVALTWKRREFPLEFPESKEFEKLKNLFMYPILMEHCQTYLEPHANMTFTQEELDYIFLVYCSANSSFSKDKWNQEKKTHTIQLILQHTRGKHLLSKFKNILGNDISNSLSFLTALTFLTRTFLFGLQNLVPYYNYYEHYGIESDKPLYHISKATVQEWMTEQKIEGVIDQHRLYLFSLYLTETIFSSLPAIPIFIILNNQADVNLIKSIILRNFTDKVASVTGYNILISPPPSEEHLTEPLIIITTKEYLPYVKKQYPKGKHHELTIALDLHVSQQRLIYQTIVDIRKEAFDKRVAN IAKKAHYLL

ORF3_(—)23FP is a cell wall surface protein. An example of an amino acidsequence of ORF3_(—)23FP is set forth in SEQ ID NO: 237. SEQ ID NO: 237MKKVRKIFQKAVAGLCCISQLTAFSSIVALAETPETSPAIGKVVIKETGEGGALLGDAVFELKNNTDGTTVSQRTEAQTGEAIFSNIKPGTYTLTEAQPPVGYKPSTKQWTVEVEKNGRTTVQGEQVENREEALSDQYPQTGTYPDVQTPYQIIKVDGSEKNGQHKALNPNPYERVIPEGTLSKRIYQVNNLDDNQYGIELTVSGKTTVETKEASTPLDVVTLLDNSNSMSNIRHNHAHRAEKAGEATRALVDKITSNPDNRVALVTYGSTIFDGSEATVEKGVADANGKTLNDSALWTEDRTTFTAKTYNYSFLNLTSDPTDIQTIKDRIPSDAEELNKDKLMYQFGATFTQKALMTADDILTKQARPNSKKVIFHITDGVPTMSYPINFKYTGTTQSYRTQLNNFKAKTPNSSGILLEDFVTWSADGEHKIVRGDGESYQMFTKKPVTDQYGVHQILSITSMEQRAKLVSAGYRFYGTDLYLYWRDSILAYPFNSSTDWITNHGDPTTWYYNGNMAQDGYDVFTVGVGVNGDPGTDEATATRFMQSISSSPDNYTNVADPSQILQELNRYFYTIVNEKKSIENGTITDPMGELIDFQLGADGRFDPADYTLTANDGSSLVNNVPTGGPQNDGGLLKNAKVFYDTTEKRIRVTGLYLGTGEKVTLTYNVRLNDQFVSNKFYDTNGRTTLHPKEVEKNTVRDFPIPKIRDVRKYPEITIPKEKKLGEIEFIKINKNDKKPLRDAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQIVNGEVRDVTSIVPQDIPAGYEETNDKHYITNEPIPPKREYPRTGGIGMLPFYLIGCMMMGGVLLYTRKNP

ORF4_(—)23FP is a cell wall surface protein. An example of an amino acidsequence of ORF4_(—)23FP is set forth in SEQ ID NO: 238. SEQ ID NO: 238MKSINKFLTMLAALLLTASSLFSAATVFAADNVSTAPDAVTKTLTIHKLLLSEDDLKTWDTNGPKGYDGTQSSLKDLTGVVAEEIPNVYFELQKYNLTDGKEKENLKDDSKWTTVHGGLTTKDGLKIETSTLKGVYRIREDRTKTTYVGPNGQVLTGSKAVPALVTLPLVNNNGTVIDAHVEPKNSYNKPVVDKRIADTLNYNDQNGLSIGTKIPYVVNTTIPSNATFATSFWSDEMTEGLTYNEDVTITLNNVAMDQADYEVTKGINGFNLKLTEAGLAKINGKDADQKIQITYSATLNSLAVADIPESNDITYHYGNHQDHGNTPKPTKPNNGQITVTKTWDSQPAPEGVKATVQLVNAKTGEKVGAPVELSENNWTYTWSGLDNSIEYKVEEEYNGYSAEYTVESKGKLGVKNWKDNNPAPINLEEPRVKTYGKKFVKVDQKDTRLENAQFVVKKADSNKYIAFKSTAQQAADEKAAATAKQKLDAAVAAYTNAADKQAAQALVDQAQQEYNVAYKEAKFGYVEVAGKDEAMVLTSNTDGQFQISGLAAGTYKLEEIKAPEGFAKIDDVEFVVGAGSWNQGEFNYLKDVQKNDATKVVNKKITIPQTGGIGTIIFAVAGAVIMGIAVYAYVKNNKDEDQLA

ORF5_(—)23FP is a cell wall surface protein. An example of an amino acidsequence of ORF5_(—)23FP is set forth in SEQ ID NO: 239. SEQ ID NO: 239MTMQKMQKMISRIFFVMALCFSLVWGAHAVQAQEDHTLVLQLENYQEVVSQLPSRDGHRLQVWKLDDSYSYDNRVQIVRDLHSWDENKLSSFKKTSFEMTFLSNQIEVSHIPNGLYYVRSIIQTDAVSYPAEFLFEMTDQTVEPLVIVAKKADTVTTKVKLIKVDQDHNRLEGVGFKLVSVARDGSEKEVPLIGEYRYSSSGQVGRTLYTDKNGEIVVTNLPLGTYRFKEVEPLAGYAVTTMDTDVQLVDHQLVTITVVNQKLPRGNVDFMKVDGRTNTSLQGAMFKVMKEENGHYTPVLQNGKEVVVASGKDGRFRVEGLEYGTYYLWELQAPTGYVQLTSPVSFTIGKDTRKELVTVVKNNKRPRTDVPDTGEETLYILMLVAILLFGSGYYLTKKTN N

ORF6_(—)23FP is a putative sortase. An example of an amino acid sequenceof ORF6_(—)23FP is set forth in SEQ ID NO: 240. SEQ ID NO: 240MLIKMAKTKKQKRNNLLLGVVFFIGIAVMAYPLVSRLYYRVESNQQIADFDKEKATLDEADIDERMKLAQAFNDSLNNVVSGDPWSEEMKKKGRAEYARMLEIHERMGHVETPAIDVDLPVYAGTAEEVLQQGAGHLEGTSLPIGGNSTHAVITAHTGLPTAKMFTDLTKLKVGDKFYVHNIKEVMAYQVDQVKVIEPTNFDDLLIVPGHDYVTLLTCTPYMINTHRLLVRGHRIPYVAEVEEEFIAANKLSHLYRYLFYVAVGLIVILLWIIRRLRKKKRQSERALKALKEATKEVKVE DE

ORF7_(—)23FP is a putative sortase. An example of an amino acid sequenceof ORF7_(—)23FP is set forth in SEQ ID NO: 241. SEQ ID NO: 241MSKSRYSRKKSVKKKKNPFILLLIFLVGLAVAMYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFNATLKPSEILDPFTEQEKKKGVSEYANMLKVHERIGYVEIPAIDQEIPMYVGTSEETLQKGAGLLEGASLPVGGENTHTVVTAHRGLPTAELFSQLDKMKKGDIFYLHVLDQVLAYQVDQIVTVEPNDFEPVLIQHGEDYATLLTCTPYMINSHRLLVRGKRIPYTAPIAERNRAVRERGQFWLWLLLGAMAVILLLLYRVYRNRRIVKGLEKQLEGRHVKD

ORF8_(—)23FP is a putative sortase. An example of an amino acid sequenceof ORF8_(—)23FP is set forth in SEQ ID NO: 242. SEQ ID NO: 242MSKSRYSRKKSVKKKKNPFILLLIFLVGLAVAMYPLVSRYYYRIESNEVIKEFDETVSQMDKAELEERWRLAQAFFLAEGYEVNYQVSDDPDAVYGYLSIPSLEIMEPVYLGADYHHLGMGLAHVDGTPLPLDGTGIRSVIAGHPAEPSHVEFRHLDQLKVGDALYYDNGQEIVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAFTKEGQSVSRVATSQWLYRGLVVLAFLGILFVLWKLARLLRGK

Immunogenic compositions of the invention comprising AI antigens may further comprise one or more antigenic agents. Preferred antigens include those listed below. Additionally, the compositions of the present invention may be used to treat or prevent infections caused by any of the below-listed microbes. Antigens for use in the immunogenic compositions include, but are not limited to, one or more of the following, or antigens derived from one or more of the following:

Bacterial Antigens

N. meningitidis: a protein antigen from N. meningitidis serogroup A, C, W135, Y, and/or B (1-7); an outer-membrane vesicle (OMV) preparation from N. meningitidis serogroup B (8, 9, 10, 11); a saccharide antigen, including LPS, from N. meningitidis serogroup A, B, C, W135 and/or Y, such as the oligosaccharide from serogroup C (see PCT/US99/09346; PCT IB98/01665; and PCT IB99/00103);

Streptococcus pneumoniae: a saccharide or protein antigen, particularly a saccharide from Streptococcus pneumoniae;

Streptococcus agalactiae: particularly, Group B streptococcus antigens;

Streptococcus pyogenes: particularly, Group A streptococcus antigens;

Enterococcus faecalis or Enterococcus faecium: particularly a trisaccharide repeat or other Enterococcus derived antigens provided in U.S. Pat. No. 6,756,361;

Helicobacter pylori: including Cag, Vac, Nap, HopX, HopY and/or urease antigen;

Bordetella pertussis: such as pertussis holotoxin (PT) and filamentous haemagglutinin (FHA) from B. pertussis, optionally also in combination with pertactin and/or agglutinogens 2 and 3 antigen;

Staphylococcus aureus: including S. aureus type 5 and 8 capsular polysaccharides optionally conjugated to nontoxic recombinant Pseudomonas aeruginosa exotoxin A, such as StaphVAX™, or antigens derived from surface proteins, invasins (leukocidin, kinases, hyaluronidase), surface factors that inhibit phagocytic engulfment (capsule, Protein A), carotenoids, catalase production, Protein A, coagulase, clotting factor, and/or membrane-damaging toxins (optionally detoxified) that lyse eukaryotic cell membranes (hemolysins, leukotoxin, leukocidin);

Staphylococcus epidermidis: particularly, S. epidermidis slime-associated antigen (SAA);

Staphylococcus saprophyticus (causing urinary tract infections): particularly the 160 kDa hemagglutinin of S. saprophyticus antigen;

Pseudomonas aeruginosa: particularly, endotoxin A, Wzz protein, P. aeruginosa LPS, more particularly LPS isolated from PAO1 (O5 serotype), and/or Outer Membrane Proteins, including Outer Membrane Proteins F (OprF) (Infect Immun. 2001 May; 69(5): 3510-3515);

Bacillus anthracis (anthrax): such as B. anthracis antigens (optionally detoxified) from A-components (lethal factor (LF) and edema factor (EF)), both of which can share a common B-component known as protective antigen (PA);

Moraxella catarrhalis (respiratory): including outer membrane protein antigens (HMW-OMP), C-antigen, and/or LPS;

Yersinia pestis (plague): such as F1 capsular antigen (Infect Immun. 2003 January; 71(1): 374-383), LPS (Infect Immun. 1999 October; 67(10): 5395), Yersinia pestis V antigen (Infect Immun. 1997 November; 65(11): 4476-4482);

Yersinia enterocolitica (gastrointestinal pathogen): particularly LPS (Infect Immun. 2002 August; 70(8): 4414);

Yersinia pseudotuberculosis: gastrointestinal pathogen antigens;

Mycobacterium tuberculosis: such as lipoproteins, LPS, BCG antigens, a fusion protein of antigen 85B (Ag85B) and/or ESAT-6 optionally formulated in cationic lipid vesicles (Infect Immun. 2004 October; 72(10): 6148), Mycobacterium tuberculosis (Mtb) isocitrate dehydrogenase associated antigens (Proc Natl Acad Sci USA. 2004 Aug. 24; 101(34): 12652), and/or MPT51 antigens (Infect Immun. 2004 July; 72(7): 3829);

Legionella pneumophila (Legionnaires' Disease): L. pneumophila antigens, optionally derived from cell lines with disrupted asd genes (Infect Immun. 1998 May; 66(5): 1898);

Rickettsia: including outer membrane proteins, including the outer membrane protein A and/or B (OmpB) (Biochim Biophys Acta. 2004 Nov. 1; 1702(2): 145), LPS, and surface protein antigen (SPA) (J Autoimmun. 1989 June; 2 Suppl:81);

E. coli: including antigens from enterotoxigenic E. coli (ETEC), enteroaggregative E. coli (EAggEC), diffusely adhering E. coli (DAEC), enteropathogenic E. coli (EPEC), and/or enterohemorrhagic E. coli (EHEC);

Vibrio cholerae: including proteinase antigens, LPS, particularly lipopolysaccharides of Vibrio cholerae II, O1 Inaba O-specific polysaccharides, V. cholerae O139, antigens of IEM108 vaccine (Infect Immun. 2003 October; 71(10):5498-504), and/or Zonula occludens toxin (Zot);

Salmonella typhi (typhoid fever): including capsular polysaccharides, preferably conjugates (Vi, i.e. vax-TyVi);

Salmonella typhimurium (gastroenteritis): antigens derived therefrom are contemplated for microbial and cancer therapies, including angiogenesis inhibition and modulation of flk;

Listeria monocytogenes (systemic infections in immunocompromised or elderly people, infections of fetus): antigens derived from L. monocytogenes are preferably used as carriers/vectors for intracytoplasmic delivery of conjugates/associated compositions of the present invention;

Porphyromonas gingivalis: particularly, P. gingivalis outer membrane protein (OMP);

Tetanus: such as tetanus toxoid (TT) antigens, preferably used as a carrier protein in conjunction/conjugated with the compositions of the present invention;

Diphtheria: such as a diphtheria toxoid, preferably CRM₁₉₇, additionally antigens capable of modulating, inhibiting or associated with ADP ribosylation are contemplated for combination/co-administration/conjugation with the compositions of the present invention, the diphtheria toxoids are preferably used as carrier proteins;

Borrelia burgdorferi (Lyme disease): such as antigens associated with P39 and P13 (an integral membrane protein, Infect Immun. 2001 May; 69(5): 3323-3334), VlsE Antigenic Variation Protein (J Clin Microbiol. 1999 December; 37(12): 3997);

Haemophilus influenzae B: such as a saccharide antigen therefrom;

Klebsiella: such as an OMP, including OMP A, or a polysaccharide optionally conjugated to tetanus toxoid;

Neisseria gonorrhoeae: including a Por (or porin) protein, such as PorB (see Zhu et al., Vaccine (2004) 22:660-669), a transferrin binding protein, such as TbpA and TbpB (see Price et al., Infection and Immunity (2004) 71(1):277-283), an opacity protein (such as Opa), a reduction-modifiable protein (Rmp), and outer membrane vesicle (OMV) preparations (see Plante et al., J Infectious Disease (2000) 182:848-855; also see, e.g., WO99/24578, WO99/36544, WO99/57280, WO02/079243);

Chlamydia pneumoniae: particularly C. pneumoniae protein antigens;

Chlamydia trachomatis: including antigens derived from serotypes A, B, Ba and C (agents of trachoma, a cause of blindness), serotypes L₁, L₂ & L₃ (associated with Lymphogranuloma venereum), and serotypes D-K;

Treponema pallidum (Syphilis): particularly a TmpA antigen; and

Haemophilus ducreyi (causing chancroid): including outer membrane protein (DsrA).

Where not specifically referenced, further bacterial antigens of the invention may be capsular antigens, polysaccharide antigens or protein antigens of any of the above. Further bacterial antigens may also include an outer membrane vesicle (OMV) preparation. Additionally, antigens include live, attenuated, split, and/or purified versions of any of the aforementioned bacteria. The bacterial or microbial derived antigens of the present invention may be gram-negative or gram-positive and aerobic or anaerobic.

Additionally, any of the above bacterial-derived saccharides (polysaccharides, LPS, LOS or oligosaccharides) can be conjugated to another agent or antigen, such as a carrier protein (for example CRM₁₉₇). Such conjugation may be direct conjugation effected by reductive amination of carbonyl moieties on the saccharide to amino groups on the protein, as provided in U.S. Pat. No. 5,360,897 and Can J Biochem Cell Biol. 1984 May;62(5):270-5. Alternatively, the saccharides can be conjugated through a linker, such as with succinamide or other linkages provided in Bioconjugate Techniques, 1996 and CRC, Chemistry of Protein Conjugation and Cross-Linking, 1993.

Viral Antigens

Influenza: including whole viral particles (attenuated), split, or subunit comprising hemagglutinin (HA) and/or neuraminidase (NA) surface proteins, the influenza antigens may be derived from chicken embryos or propagated on cell culture, and/or the influenza antigens are derived from influenza type A, B, and/or C, among others;

Respiratory syncytial virus (RSV): including the F protein of the A2 strain of RSV (J Gen Virol. 2004 November; 85(Pt 11):3229) and/or G glycoprotein;

Parainfluenza virus (PIV): including PIV type 1, 2, and 3, preferably containing hemagglutinin, neuraminidase and/or fusion glycoproteins;

Poliovirus: including antigens from a family of Picornaviridae, preferably poliovirus antigens such as OPV or, preferably, IPV;

Measles: including split measles virus (MV) antigen optionally combined with the Protollin and/or antigens present in MMR vaccine;

Mumps: including antigens present in MMR vaccine;

Rubella: including antigens present in MMR vaccine as well as other antigens from Togaviridae, including dengue virus;

Rabies: such as lyophilized inactivated virus (RabAvert™);

Flaviviridae viruses: such as (and antigens derived therefrom) yellow fever virus, Japanese encephalitis virus, dengue virus (types 1, 2, 3, or 4), tick borne encephalitis virus, and West Nile virus;

Caliciviridae: antigens therefrom;

HIV: including HIV-1 or HIV-2 strain antigens, such as gag (p24gag and p55gag), env (gp160 and gp41), pol, tat, nef, rev, vpu, miniproteins (preferably p55 gag and gp140v delete) and antigens from the isolates HIV_(IIb), HIV_(SF2), HIV_(LAV), HIV_(LA1), HIV_(MN), HIV-1_(CM235), HIV-1_(US4), HIV-2; simian immunodeficiency virus (SIV) among others;

Rotavirus: including VP4, VP5, VP6, VP7, VP8 proteins (Protein Expr Purif. 2004 December; 38(2):205) and/or NSP4;

Pestivirus: such as antigens from classical porcine fever virus, bovine viral diarrhoea virus, and/or border disease virus;

Parvovirus: such as parvovirus B19;

Coronavirus: including SARS virus antigens, particularly spike protein or proteases therefrom, as well as antigens included in WO 04/92360;

Hepatitis A virus: such as inactivated virus;

Hepatitis B virus: such as the surface and/or core antigens (sAg), as well as the presurface sequences, pre-S1 and pre-S2 (formerly called pre-S), as well as combinations of the above, such as sAg/pre-S1, sAg/pre-S2, sAg/pre-S1/pre-S2, and pre-S1/pre-S2 (see, e.g., HBV Vaccines, Human Vaccines and Vaccination, pp. 159-176; and U.S. Pat. Nos. 4,722,840, 5,098,704, 5,324,513; Beames et al., J. Virol. (1995) 69:6833-6838, Birnbaum et al., J. Virol. (1990) 64:3319-3330; and Zhou et al., J. Virol. (1991) 65:5457-5464);

Hepatitis C virus: such as E1, E2, E1/E2 (see, Houghton et al., Hepatology (1991) 14:381), NS345 polyprotein, NS 345-core polyprotein, core, and/or peptides from the nonstructural regions (International Publication Nos. WO 89/04669; WO 90/11089; and WO 90/14436);

Delta hepatitis virus (HDV): antigens derived therefrom, particularly δ-antigen from HDV (see, e.g., U.S. Pat. No. 5,378,814);

Hepatitis E virus (HEV): antigens derived therefrom;

Hepatitis G virus (HGV): antigens derived therefrom;

Varicella zoster virus: antigens derived from varicella zoster virus (VZV) (J. Gen. Virol. (1986) 67:1759);

Epstein-Barr virus: antigens derived from EBV (Baer et al., Nature (1984) 310:207);

Cytomegalovirus: CMV antigens, including gB and gH (Cytomegaloviruses (J. K. McDougall, ed., Springer-Verlag 1990) pp. 125-169);

Herpes simplex virus: including antigens from HSV-1 or HSV-2 strains and glycoproteins gB, gD and gH (McGeoch et al., J. Gen. Virol. (1988) 69:1531 and U.S. Pat. No. 5,171,568);

Human Herpes Virus: antigens derived from other human herpesviruses such as HHV6 and HHV7; and

HPV: including antigens associated with or derived from human papillomavirus (HPV), for example, one or more of E1-E7, L1, L2, and fusions thereof, particularly the compositions of the invention may include a virus-like particle (VLP) comprising the L1 major capsid protein, more particularly still, the HPV antigens are protective against one or more of HPV serotypes 6, 11, 16 and/or 18.

Further provided are antigens, compostions, methods, and microbesincluded in Vaccines, 4^(th) Edition (Plotkin and Orenstein ed. 2004);Medical Microbiology 4^(th) Edition (Murray et al. ed. 2002); Virology,3rd Edition (W. K. Joklik ed. 1988); Fundamental Virology, 2nd Edition(B. N. Fields and D. M. Knipe, eds. 1991), which are contemplated inconjunction with the compositions of the present invention.

Additionally, antigens include live, attenuated, split, and/or purifiedversions of any of the aforementioned viruses.

Fungal Antigens

Fungal antigens for use herein that are associated with vaccines include those described in: U.S. Pat. Nos. 4,229,434 and 4,368,191 for prophylaxis and treatment of trichophytosis caused by Trichophyton mentagrophytes; U.S. Pat. Nos. 5,277,904 and 5,284,652 for a broad spectrum dermatophyte vaccine for the prophylaxis of dermatophyte infection in animals, such as guinea pigs, cats, rabbits, horses and lambs, in which the antigens comprise a suspension of killed T. equinum, T. mentagrophytes (var. granulare), M. canis and/or M. gypseum in an effective amount, optionally combined with an adjuvant; U.S. Pat. Nos. 5,453,273 and 6,132,733 for a ringworm vaccine comprising an effective amount of a homogenized, formaldehyde-killed fungus, i.e., Microsporum canis culture, in a carrier; and U.S. Pat. No. 5,948,413 involving extracellular and intracellular proteins for pythiosis. Additional antigens identified within antifungal vaccines include Ringvac bovis LTF-130 and Bioveta.

Further, fungal antigens for use herein may be derived from Dermatophytes, including: Epidermophyton floccosum, Microsporum audouini, Microsporum canis, Microsporum distortum, Microsporum equinum, Microsporum gypseum, Microsporum nanum, Trichophyton concentricum, Trichophyton equinum, Trichophyton gallinae, Trichophyton gypseum, Trichophyton megnini, Trichophyton mentagrophytes, Trichophyton quinckeanum, Trichophyton rubrum, Trichophyton schoenleini, Trichophyton tonsurans, Trichophyton verrucosum, T. verrucosum var. album, var. discoides, var. ochraceum, Trichophyton violaceum, and/or Trichophyton faviforme.

Fungal pathogens for use as antigens or in derivation of antigens in conjunction with the compositions of the present invention comprise Aspergillus fumigatus, Aspergillus flavus, Aspergillus niger, Aspergillus nidulans, Aspergillus terreus, Aspergillus sydowi, Aspergillus flavatus, Aspergillus glaucus, Blastoschizomyces capitatus, Candida albicans, Candida enolase, Candida tropicalis, Candida glabrata, Candida krusei, Candida parapsilosis, Candida stellatoidea, Candida kusei, Candida parakwsei, Candida lusitaniae, Candida pseudotropicalis, Candida guilliermondi, Cladosporium carrionii, Coccidioides immitis, Blastomyces dermatitidis, Cryptococcus neoformans, Geotrichum clavatum, Histoplasma capsulatum, Klebsiella pneumoniae, Paracoccidioides brasiliensis, Pneumocystis carinii, Pythium insidiosum, Pityrosporum ovale, Saccharomyces cerevisiae, Saccharomyces boulardii, Saccharomyces pombe, Scedosporium apiospermum, Sporothrix schenckii, Trichosporon beigelii, Toxoplasma gondii, Penicillium marneffei, Malassezia spp., Fonsecaea spp., Wangiella spp., Sporothrix spp., Basidiobolus spp., Conidiobolus spp., Rhizopus spp., Mucor spp., Absidia spp., Mortierella spp., Cunninghamella spp., and Saksenaea spp.

Other fungi from which antigens are derived include Alternaria spp., Curvularia spp., Helminthosporium spp., Fusarium spp., Aspergillus spp., Penicillium spp., Monolinia spp., Rhizoctonia spp., Paecilomyces spp., Pithomyces spp., and Cladosporium spp.

Processes for producing fungal antigens are well known in the art (see U.S. Pat. No. 6,333,164). In a preferred method, a solubilized fraction is extracted and separated from an insoluble fraction obtainable from fungal cells of which the cell wall has been substantially removed or at least partially removed. The process comprises the steps of: obtaining living fungal cells; obtaining fungal cells of which the cell wall has been substantially removed or at least partially removed; bursting the fungal cells of which the cell wall has been substantially removed or at least partially removed; obtaining an insoluble fraction; and extracting and separating a solubilized fraction from the insoluble fraction.

STD Antigens

In particular embodiments, microbes (bacteria, viruses and/or fungi) against which the present compositions and methods can be implemented include those that cause sexually transmitted diseases (STDs) and/or those that display on their surface an antigen that can be the target or antigen composition of the invention. In a preferred embodiment of the invention, compositions are combined with antigens derived from a viral or bacterial STD. Antigens derived from bacteria or viruses can be administered in conjunction with the compositions of the present invention to provide protection against at least one of the following STDs, among others: chlamydia, genital herpes, hepatitis (particularly HCV), genital warts, gonorrhoea, syphilis and/or chancroid (see WO00/15255).

In another embodiment the compositions of the present invention are co-administered with an antigen for the prevention or treatment of an STD.

Antigens derived from the following viruses associated with STDs, which are described in greater detail above, are preferred for co-administration with the compositions of the present invention: hepatitis (particularly HCV), HPV, HIV, or HSV.

Additionally, antigens derived from the following bacteria associated with STDs, which are described in greater detail above, are preferred for co-administration with the compositions of the present invention: Neisseria gonorrhoeae, Chlamydia pneumoniae, Chlamydia trachomatis, Treponema pallidum, or Haemophilus ducreyi.

Respiratory Antigens

The antigen may be a respiratory antigen and could further be used in an immunogenic composition for methods of preventing and/or treating infection by a respiratory pathogen, including a virus, bacterium, or fungus such as respiratory syncytial virus (RSV), PIV, SARS virus, influenza, or Bacillus anthracis, particularly by reducing or preventing infection and/or one or more symptoms of respiratory virus infection. A composition comprising an antigen described herein, such as one derived from a respiratory virus, bacterium or fungus, is administered in conjunction with the compositions of the present invention to an individual who is at risk of being exposed to that particular respiratory microbe, has been exposed to a respiratory microbe, or is infected with a respiratory virus, bacterium or fungus. The composition(s) of the present invention is/are preferably co-administered at the same time or in the same formulation with an antigen of the respiratory pathogen. Administration of the composition results in reduced incidence and/or severity of one or more symptoms of respiratory infection.

Pediatric/Geriatric Antigens

In one embodiment the compositions of the present invention are used in conjunction with an antigen for treatment of a pediatric population, as in a pediatric antigen. In a more particular embodiment the pediatric population is less than about 3 years old, or less than about 2 years old, or less than about 1 year old. In another embodiment the pediatric antigen (in conjunction with the composition of the present invention) is administered multiple times over at least 1, 2, or 3 years.

In another embodiment the compositions of the present invention are used in conjunction with an antigen for treatment of a geriatric population, as in a geriatric antigen.

Other Antigens

Other antigens for use in conjunction with the compositions of the present invention include hospital acquired (nosocomial) associated antigens.

In another embodiment, parasitic antigens are contemplated in conjunction with the compositions of the present invention. Examples of parasitic antigens include those derived from organisms causing malaria and/or Lyme disease.

In another embodiment, the antigens in conjunction with the compositions of the present invention are associated with or effective against a mosquito-borne illness. In another embodiment, the antigens in conjunction with the compositions of the present invention are associated with or effective against encephalitis. In another embodiment the antigens in conjunction with the compositions of the present invention are associated with or effective against an infection of the nervous system.

In another embodiment, the antigens in conjunction with the compositions of the present invention are antigens transmissible through blood or body fluids.

Antigen Formulations

In other aspects of the invention, methods of producing microparticles having adsorbed antigens are provided. The methods comprise: (a) providing an emulsion by dispersing a mixture comprising (i) water, (ii) a detergent, (iii) an organic solvent, and (iv) a biodegradable polymer selected from the group consisting of a poly(α-hydroxy acid), a polyhydroxy butyric acid, a polycaprolactone, a polyorthoester, a polyanhydride, and a polycyanoacrylate, wherein the polymer is typically present in the mixture at a concentration of about 1% to about 30% relative to the organic solvent, while the detergent is typically present in the mixture at a weight-to-weight detergent-to-polymer ratio of from about 0.00001:1 to about 0.1:1 (more typically about 0.0001:1 to about 0.1:1, about 0.001:1 to about 0.1:1, or about 0.005:1 to about 0.1:1); (b) removing the organic solvent from the emulsion; and (c) adsorbing an antigen on the surface of the microparticles. In certain embodiments, the biodegradable polymer is present at a concentration of about 3% to about 10% relative to the organic solvent.
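The concentration ranges above lend themselves to a short worked calculation. The following Python sketch is illustrative only: it assumes the polymer percentage is weight per volume of organic solvent (the text says only "relative to the organic solvent"), it treats the "about" bounds as hard limits for simplicity, and the function names are hypothetical.

```python
def polymer_mass_g(solvent_ml: float, polymer_pct: float) -> float:
    """Polymer mass for a given percentage relative to the organic solvent.

    Assumption: the percentage is weight/volume (g per 100 mL of solvent).
    The 1% to 30% window is the typical range stated in the text.
    """
    if not 1.0 <= polymer_pct <= 30.0:
        raise ValueError("polymer concentration outside the typical 1-30% range")
    return solvent_ml * polymer_pct / 100.0


def detergent_mass_g(polymer_g: float, ratio: float) -> float:
    """Detergent mass from a weight-to-weight detergent-to-polymer ratio.

    The ratio is expressed as r in r:1; the text gives 0.00001:1 to 0.1:1.
    """
    if not 0.00001 <= ratio <= 0.1:
        raise ValueError("detergent-to-polymer ratio outside 0.00001:1 to 0.1:1")
    return polymer_g * ratio


if __name__ == "__main__":
    polymer = polymer_mass_g(solvent_ml=10.0, polymer_pct=6.0)
    detergent = detergent_mass_g(polymer, ratio=0.005)
    print(f"polymer: {polymer:.3f} g, detergent: {detergent * 1000:.1f} mg")
```

Under these assumptions, 10 mL of solvent at 6% polymer and a 0.005:1 ratio corresponds to 0.6 g of polymer and 3 mg of detergent.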

Microparticles for use herein will be formed from materials that are sterilizable, non-toxic and biodegradable. Such materials include, without limitation, poly(α-hydroxy acid), polyhydroxybutyric acid, polycaprolactone, polyorthoester, polyanhydride, PACA, and polycyanoacrylate. Preferably, microparticles for use with the present invention are derived from a poly(α-hydroxy acid), in particular, from a poly(lactide) (“PLA”) or a copolymer of D,L-lactide and glycolide or glycolic acid, such as a poly(D,L-lactide-co-glycolide) (“PLG” or “PLGA”), or a copolymer of D,L-lactide and caprolactone. The microparticles may be derived from any of various polymeric starting materials which have a variety of molecular weights and, in the case of the copolymers such as PLG, a variety of lactide:glycolide ratios, the selection of which will be largely a matter of choice, depending in part on the coadministered macromolecule. These parameters are discussed more fully below.

Further antigens may also include an outer membrane vesicle (OMV) preparation.

Additional formulation methods and antigens (especially tumor antigens) are provided in U.S. patent Ser. No. 09/581,772.

Antigen References

The following references include antigens useful in conjunction with the compositions of the present invention:

-   1 International patent application WO99/24578.
-   2 International patent application WO99/36544.
-   3 International patent application WO99/57280.
-   4 International patent application WO00/22430.
-   5 Tettelin et al. (2000) Science 287:1809-1815.
-   6 International patent application WO96/29412.
-   7 Pizza et al. (2000) Science 287:1816-1820.
-   8 PCT WO 01/52885.
-   9 Bjune et al. (1991) Lancet 338(8775).
-   10 Fuskasawa et al. (1999) Vaccine 17:2951-2958.
-   11 Rosenqist et al. (1998) Dev. Biol. Stand. 92:323-333.
-   12 Constantino et al. (1992) Vaccine 10:691-698.
-   13 Constantino et al. (1999) Vaccine 17:1251-1263.
-   14 Watson (2000) Pediatr Infect Dis J 19:331-332.
-   15 Rubin (2000) Pediatr Clin North Am 47:269-285, v.
-   16 Jedrzejas (2001) Microbiol Mol Biol Rev 65:187-207.
-   17 International patent application filed on 3rd Jul. 2001 claiming priority from GB-0016363.4; WO 02/02606; PCT IB/01/00166.
-   18 Kalman et al. (1999) Nature Genetics 21:385-389.
-   19 Read et al. (2000) Nucleic Acids Res 28:1397-406.
-   20 Shirai et al. (2000) J. Infect. Dis 181(Suppl 3):S524-S527.
-   21 International patent application WO99/27105.
-   22 International patent application WO00/27994.
-   23 International patent application WO00/37494.
-   24 International patent application WO99/28475.
-   25 Bell (2000) Pediatr Infect Dis J 19:1187-1188.
-   26 Iwarson (1995) APMIS 103:321-326.
-   27 Gerlich et al. (1990) Vaccine 8 Suppl:S63-68 & 79-80.
-   28 Hsu et al. (1999) Clin Liver Dis 3:901-915.
-   29 Gastofsson et al. (1996) N. Engl. J. Med. 334:349-355.
-   30 Rappuoli et al. (1991) TIBTECH 9:232-238.
-   31 Vaccines (1988) eds. Plotkin & Mortimer. ISBN 0-7216-1946-0.
-   32 Del Guidice et al. (1998) Molecular Aspects of Medicine 19:1-70.
-   33 International patent application WO93/018150.
-   34 International patent application WO99/53310.
-   35 International patent application WO98/04702.
-   36 Ross et al. (2001) Vaccine 19:135-142.
-   37 Sutter et al. (2000) Pediatr Clin North Am 47:287-308.
-   38 Zimmerman & Spann (1999) Am Fam Physician 59:113-118, 125-126.
-   39 Dreensen (1997) Vaccine 15 Suppl:S2-6.
-   40 MMWR Morb Mortal Wkly Rep 1998 Jan. 16: 47(1):12, 9.
-   41 McMichael (2000) Vaccine 19 Suppl 1:S101-107.
-   42 Schuchat (1999) Lancet 353(9146):51-6.
-   43 GB patent applications 0026333.5, 0028727.6 & 0105640.7.
-   44 Dale (1999) Infect Dis Clin North Am 13:227-43, viii.
-   45 Ferretti et al. (2001) PNAS USA 98:4658-4663.
-   46 Kuroda et al. (2001) Lancet 357(9264):1225-1240; see also pages 1218-1219.
-   47 Ramsay et al. (2001) Lancet 357(9251):195-196.
-   48 Lindberg (1999) Vaccine 17 Suppl 2:S28-36.
-   49 Buttery & Moxon (2000) J R Coll Physicians Lond 34:163-168.
-   50 Ahmad & Chapnick (1999) Infect Dis Clin North Am 13:113-133, vii.
-   51 Goldblatt (1998) J. Med. Microbiol. 47:563-567.
-   52 European patent 0 477 508.
-   53 U.S. Pat. No. 5,306,492.
-   54 International patent application WO98/42721.
-   55 Conjugate Vaccines (eds. Cruse et al.) ISBN 3805549326, particularly vol. 10:48-114.
-   56 Hermanson (1996) Bioconjugate Techniques ISBN: 012323368 & 012342335X.
-   57 European patent application 0372501.
-   58 European patent application 0378881.
-   59 European patent application 0427347.
-   60 International patent application WO93/17712.
-   61 International patent application WO98/58668.
-   62 European patent application 0471177.
-   63 International patent application WO00/56360.
-   64 International patent application WO00/67161.

The contents of all of the above cited patents, patent applications and journal articles are incorporated by reference as if set forth fully herein.

There may be an upper limit to the number of Gram positive bacterial proteins which will be in the compositions of the invention. Preferably, the number of Gram positive bacterial proteins in a composition of the invention is less than 20, less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than 12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, or less than 3. Still more preferably, the number of Gram positive bacterial proteins in a composition of the invention is less than 6, less than 5, or less than 4. Still more preferably, the number of Gram positive bacterial proteins in a composition of the invention is 3.

The Gram positive bacterial proteins and polynucleotides used in the invention are preferably isolated, i.e., separate and discrete, from the whole organism with which the molecule is found in nature or, when the polynucleotide or polypeptide is not found in nature, are sufficiently free of other biological macromolecules so that the polynucleotide or polypeptide can be used for its intended purpose.

Fusion Proteins: GBS AI Sequences

The GBS AI proteins used in the invention may be present in the composition as individual separate polypeptides, but it is preferred that at least two (i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18) of the antigens are expressed as a single polypeptide chain (a “hybrid” or “fusion” polypeptide). Such fusion polypeptides offer two principal advantages: first, a polypeptide that may be unstable or poorly expressed on its own can be assisted by adding a suitable fusion partner that overcomes the problem; second, commercial manufacture is simplified as only one expression and purification need be employed in order to produce two polypeptides which are both antigenically useful.

The fusion polypeptide may comprise one or more AI polypeptide sequences. Preferably, the fusion comprises an AI surface protein sequence. Preferably, the fusion polypeptide includes one or more of GBS 80, GBS 104, and GBS 67. Most preferably, the fusion peptide includes a polypeptide sequence from GBS 80. Accordingly, the invention includes a fusion peptide comprising a first amino acid sequence and a second amino acid sequence, wherein said first and second amino acid sequences are selected from a GBS AI surface protein or a fragment thereof. Preferably, the first and second amino acid sequences in the fusion polypeptide comprise different epitopes.

Hybrids (or fusions) consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten GBS antigens are preferred. In particular, hybrids consisting of amino acid sequences from two, three, four, or five GBS antigens are preferred.

Different hybrid polypeptides may be mixed together in a single formulation. Within such combinations, a GBS antigen may be present in more than one hybrid polypeptide and/or as a non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a non-hybrid, but not as both.

Hybrid polypeptides can be represented by the formula NH₂-A-{-X-L-}ₙ-B-COOH, wherein: X is an amino acid sequence of a GBS AI protein or a fragment thereof; L is an optional linker amino acid sequence; A is an optional N-terminal amino acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15.

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X- moiety located at the N-terminus of the hybrid protein, i.e. the leader peptide of X₁ will be retained, but the leader peptides of X₂ ... Xₙ will be omitted. This is equivalent to deleting all leader peptides and using the leader peptide of X₁ as moiety -A-.

For each of the n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For instance, when n=2 the hybrid may be NH₂-X₁-L₁-X₂-L₂-COOH, NH₂-X₁-X₂-COOH, NH₂-X₁-L₁-X₂-COOH, NH₂-X₁-X₂-L₂-COOH, etc. Linker amino acid sequence(s) -L- will typically be short (e.g. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e. comprising Glyₙ where n=2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisₙ where n=3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker amino acid sequences will be apparent to those skilled in the art. A useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a BamHI restriction site, thus aiding cloning and manipulation, and the (Gly)₄ tetrapeptide being a typical poly-glycine linker.
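Because each linker position can independently be present or absent, a hybrid with n X moieties has 2ⁿ possible linker arrangements. The short Python sketch below enumerates them as schematic strings; it is an illustration only, the function name is hypothetical, and the optional -A- and -B- moieties are left out for brevity.

```python
from itertools import product


def linker_variants(n: int) -> list[str]:
    """List every linker present/absent arrangement for n X moieties.

    Each arrangement is rendered schematically, e.g. "NH2-X1-L1-X2-COOH".
    The optional A and B moieties of the general formula are omitted.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    variants = []
    # One boolean per X moiety: True means the linker following it is present.
    for present in product([True, False], repeat=n):
        parts = ["NH2"]
        for i, has_linker in enumerate(present, start=1):
            parts.append(f"X{i}")
            if has_linker:
                parts.append(f"L{i}")
        parts.append("COOH")
        variants.append("-".join(parts))
    return variants


if __name__ == "__main__":
    for variant in linker_variants(2):
        print(variant)
```

For n=2 this yields the four arrangements listed in the text.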

-A- is an optional N-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein trafficking, or short peptide sequences which facilitate cloning or purification (e.g. histidine tags i.e. Hisₙ where n=3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be apparent to those skilled in the art. If X₁ lacks its own N-terminus methionine, -A- is preferably an oligopeptide (e.g. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides an N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein trafficking, short peptide sequences which facilitate cloning or purification (e.g. comprising histidine tags i.e. Hisₙ, where n=3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability. Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art.

Most preferably, n is 2 or 3.
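The general formula for a hybrid polypeptide can be sketched as a simple string assembly. The Python example below is a minimal illustration, not a construct design tool: the function name and the placeholder sequences are hypothetical (they are not GBS sequences), and the "typical" length limits from the text (linkers of 20 or fewer residues, -A- and -B- of 40 or fewer) are enforced as hard limits purely for simplicity.

```python
# Linker named in the text: Gly-Ser followed by (Gly)4. The BamHI site
# GGATCC reads as the codons GGA (Gly) and TCC (Ser).
GS_G4_LINKER = "GSGGGG"


def build_hybrid(x_seqs, linkers=None, a="", b=""):
    """Assemble A-{X-L}n-B as a one-letter amino acid string.

    x_seqs  : the n X moieties (n must be 2 to 15)
    linkers : one linker per X moiety; "" means the linker is absent
    a, b    : optional N-terminal and C-terminal sequences
    """
    n = len(x_seqs)
    if not 2 <= n <= 15:
        raise ValueError("n must be between 2 and 15")
    if linkers is None:
        linkers = [""] * n
    if len(linkers) != n:
        raise ValueError("provide exactly one linker entry per X moiety")
    if any(len(linker) > 20 for linker in linkers):
        raise ValueError("linkers are typically 20 or fewer amino acids")
    if len(a) > 40 or len(b) > 40:
        raise ValueError("A and B are typically 40 or fewer amino acids")
    body = "".join(x + linker for x, linker in zip(x_seqs, linkers))
    return a + body + b


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    hybrid = build_hybrid(
        ["MKTAY", "AQLVD"],
        linkers=[GS_G4_LINKER, ""],
        b="HHHHHH",  # C-terminal His6 tag as moiety B
    )
    print(hybrid)
```

Here n is 2, the first X moiety is followed by the GSGGGG linker, the second has no linker, and a hexahistidine tag serves as -B-.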

Fusion Proteins: Gram Positive Bacteria AI Sequences

The Gram positive bacteria AI proteins used in the invention may be present in the composition as individual separate polypeptides, but it is preferred that at least two (i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18) of the antigens are expressed as a single polypeptide chain (a “hybrid” or “fusion” polypeptide). Such fusion polypeptides offer two principal advantages: first, a polypeptide that may be unstable or poorly expressed on its own can be assisted by adding a suitable fusion partner that overcomes the problem; second, commercial manufacture is simplified as only one expression and purification need be employed in order to produce two polypeptides which are both antigenically useful.

The fusion polypeptide may comprise one or more AI polypeptide sequences. Preferably, the fusion comprises an AI surface protein sequence. Accordingly, the invention includes a fusion peptide comprising a first amino acid sequence and a second amino acid sequence, wherein said first and second amino acid sequences are selected from a Gram positive bacteria AI protein or a fragment thereof. Preferably, the first and second amino acid sequences in the fusion polypeptide comprise different epitopes.

Hybrids (or fusions) consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten Gram positive bacteria antigens are preferred. In particular, hybrids consisting of amino acid sequences from two, three, four, or five Gram positive bacteria antigens are preferred.

Different hybrid polypeptides may be mixed together in a single formulation. Within such combinations, a Gram positive bacteria AI sequence may be present in more than one hybrid polypeptide and/or as a non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a non-hybrid, but not as both.

Hybrid polypeptides can be represented by the formula NH₂-A-{-X-L-}ₙ-B-COOH, wherein: X is an amino acid sequence of a Gram positive bacteria AI sequence or a fragment thereof; L is an optional linker amino acid sequence; A is an optional N-terminal amino acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15.

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X- moiety located at the N-terminus of the hybrid protein, i.e. the leader peptide of X₁ will be retained, but the leader peptides of X₂ ... Xₙ will be omitted. This is equivalent to deleting all leader peptides and using the leader peptide of X₁ as moiety -A-.

For each of the n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For instance, when n=2 the hybrid may be NH₂-X₁-L₁-X₂-L₂-COOH, NH₂-X₁-X₂-COOH, NH₂-X₁-L₁-X₂-COOH, NH₂-X₁-X₂-L₂-COOH, etc. Linker amino acid sequence(s) -L- will typically be short (e.g. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e. comprising Glyₙ, where n=2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisₙ where n=3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker amino acid sequences will be apparent to those skilled in the art. A useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a BamHI restriction site, thus aiding cloning and manipulation, and the (Gly)₄ tetrapeptide being a typical poly-glycine linker.

-A- is an optional N-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein trafficking, or short peptide sequences which facilitate cloning or purification (e.g. histidine tags i.e. Hisₙ, where n=3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be apparent to those skilled in the art. If X₁ lacks its own N-terminus methionine, -A- is preferably an oligopeptide (e.g. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides an N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein trafficking, short peptide sequences which facilitate cloning or purification (e.g. comprising histidine tags i.e. Hisₙ, where n=3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability. Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art.

Most preferably, n is 2 or 3.

Antibodies: GBS AI Sequences

The GBS AI proteins of the invention may also be used to prepare antibodies specific to the GBS AI proteins. The antibodies are preferably specific to an oligomeric or hyper-oligomeric form of an AI protein. The invention also includes combinations of antibodies specific to GBS AI proteins selected to provide protection against an increased range of GBS serotypes and strain isolates. For example, a combination may comprise a first and second antibody, wherein said first antibody is specific to a first GBS AI protein and said second antibody is specific to a second GBS AI protein. Preferably, the nucleic acid sequence encoding said first GBS AI protein is not present in a GBS genome comprising a polynucleotide sequence encoding said second GBS AI protein. Preferably, the nucleic acid sequences encoding said first and second GBS AI proteins are present in the genomes of multiple GBS serotypes and strain isolates.

The GBS specific antibodies of the invention include one or more biological moieties that, through chemical or physical means, can bind to or associate with an epitope of a GBS polypeptide. The antibodies of the invention include antibodies which specifically bind to a GBS AI protein. The invention includes antibodies obtained from both polyclonal and monoclonal preparations, as well as the following: hybrid (chimeric) antibody molecules (see, for example, Winter et al. (1991) Nature 349:293-299; and U.S. Pat. No. 4,816,567); F(ab′)₂ and F(ab) fragments; Fv molecules (non-covalent heterodimers, see, for example, Inbar et al. (1972) Proc Natl Acad Sci USA 69:2659-2662; and Ehrlich et al. (1980) Biochem 19:4091-4096); single-chain Fv molecules (sFv) (see, for example, Huston et al. (1988) Proc Natl Acad Sci USA 85:5879-5883); dimeric and trimeric antibody fragment constructs; minibodies (see, e.g., Pack et al. (1992) Biochem 31:1579-1584; Cumber et al. (1992) J Immunology 149B:120-126); humanized antibody molecules (see, for example, Riechmann et al. (1988) Nature 332:323-327; Verhoeyan et al. (1988) Science 239:1534-1536; and U.K. Patent Publication No. GB 2,276,169, published 21 Sep. 1994); and any functional fragments obtained from such molecules, wherein such fragments retain immunological binding properties of the parent antibody molecule. The invention further includes antibodies obtained through non-conventional processes, such as phage display.

Preferably, the GBS specific antibodies of the invention are monoclonal antibodies. Monoclonal antibodies of the invention include an antibody composition having a homogeneous antibody population. The monoclonal antibodies of the invention include those obtained from murine hybridomas, as well as human monoclonal antibodies obtained using human rather than murine hybridomas. See, e.g., Cote, et al. Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, 1985, p 77.

The antibodies of the invention may be used in diagnostic applications, for example, to detect the presence or absence of GBS in a biological sample. The antibodies of the invention may also be used in the prophylactic or therapeutic treatment of GBS infection.

Antibodies: Gram Positive Bacteria AI Sequences

The Gram positive bacteria AI proteins of the invention may also be used to prepare antibodies specific to the Gram positive bacteria AI proteins. The antibodies are preferably specific to an oligomeric or hyper-oligomeric form of an AI protein. The invention also includes combinations of antibodies specific to Gram positive bacteria AI proteins selected to provide protection against an increased range of Gram positive bacteria genera, species, serotypes and strain isolates.

For example, a combination may comprise a first and second antibody, wherein said first antibody is specific to a first Gram positive bacteria AI protein and said second antibody is specific to a second Gram positive bacteria AI protein. Preferably, the nucleic acid sequence encoding said first Gram positive bacteria AI protein is not present in a Gram positive bacterial genome comprising a polynucleotide sequence encoding said second Gram positive bacteria AI protein. Preferably, the nucleic acid sequences encoding said first and second Gram positive bacteria AI proteins are present in the genomes of multiple Gram positive bacteria genera, species, serotypes or strain isolates.

As an example of an instance where the combination of antibodies provides protection against an increased range of bacteria serotypes, the first antibody may be specific to a first GAS AI protein and the second antibody may be specific to a second GAS AI protein. The first GAS AI protein may comprise a GAS AI-1 surface protein, while the second GAS AI protein may comprise a GAS AI-2 or AI-3 surface protein.

As an example of an instance where the combination of antibodies provides protection against an increased range of bacterial species, the first antibody may be specific to a GBS AI protein and the second antibody may be specific to a GAS AI protein. Alternatively, the first antibody may be specific to a GAS AI protein and the second antibody may be specific to a S. pneumoniae AI protein.

The Gram positive specific antibodies of the invention include one ormore biological moieties that, through chemical or physical means, canbind to or associate with an epitope of a Gram positive bacteria AIpolypeptide. The antibodies of the invention include antibodies whichspecifically bind to a Gram positive bacteria AI protein. The inventionincludes antibodies obtained from both polyclonal and monoclonalpreparations, as well as the following: hybrid (chimeric) antibodymolecules (see, for example, Winter et al. (1991) Nature 349: 293-299;and U.S. Pat. No. 4,816,567; F(ab′)₂ and F(ab) fragments; F_(v)molecules (non-covalent heterodimers, see, for example, Inbar et al.(1972) Proc Natl Acad Sci USA 69:2659-2662; and Ehrlich et al. (1980)Biochem 19:4091_(—)4096); single-chain Fv molecules (sFv) (see, forexample, Huston et al. (1988) Proc Natl Acad Sci USA 85:5897-5883);dimeric and trimeric antibody fragment constructs; minibodies (see,e.g., Pack et al. (1992) Biochem 31:1579-1584; Cumber et al. (1992) JImmunology 149B: 120-126); humanized antibody molecules (see, forexample, Riechmann et al. (1988) Nature 332:323-327; Verhoeyan et al.(1988) Science 239:1534-1536; and U.K. Patent Publication No. GB2,276,169, published 21 Sep. 1994); and, any functional fragmentsobtained from such molecules, wherein such fragments retainimmunological binding properties of the parent antibody molecule. Theinvention further includes antibodies obtained through non-conventionalprocesses, such as phage display.

Preferably, the Gram positive specific antibodies of the invention are monoclonal antibodies. Monoclonal antibodies of the invention include an antibody composition having a homogeneous antibody population. Monoclonal antibodies of the invention may be obtained from murine hybridomas, and also include human monoclonal antibodies obtained using human rather than murine hybridomas. See, e.g., Cote, et al. Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, 1985, p 77.

The antibodies of the invention may be used in diagnostic applications, for example, to detect the presence or absence of Gram positive bacteria in a biological sample. The antibodies of the invention may also be used in the prophylactic or therapeutic treatment of Gram positive bacteria infection.

Nucleic Acids

The invention provides nucleic acids encoding the Gram positive bacteria sequences and/or the hybrid fusion polypeptides of the invention. The invention also provides nucleic acid encoding the GBS antigens and/or the hybrid fusion polypeptides of the invention. Furthermore, the invention provides nucleic acid which can hybridise to these nucleic acids, preferably under “high stringency” conditions (e.g. 65° C. in a 0.1×SSC, 0.5% SDS solution).

Polypeptides of the invention can be prepared by various means (e.g. recombinant expression, purification from cell culture, chemical synthesis, etc.) and in various forms (e.g. native, fusions, non-glycosylated, lipidated, etc.). They are preferably prepared in substantially pure form (i.e. substantially free from other GAS or host cell proteins).

Nucleic acid according to the invention can be prepared in many ways (e.g. by chemical synthesis, from genomic or cDNA libraries, from the organism itself, etc.) and can take various forms (e.g. single stranded, double stranded, vectors, probes, etc.). It is preferably prepared in substantially pure form (i.e. substantially free from other GBS or host cell nucleic acids).

The term “nucleic acid” includes DNA and RNA, and also their analogues, such as those containing modified backbones (e.g. phosphorothioates, etc.), and also peptide nucleic acids (PNA), etc. The invention includes nucleic acid comprising sequences complementary to those described above (e.g. for antisense or probing purposes).

The invention also provides a process for producing a polypeptide of the invention, comprising the step of culturing a host cell transformed with nucleic acid of the invention under conditions which induce polypeptide expression.

The invention provides a process for producing a polypeptide of the invention, comprising the step of synthesising at least part of the polypeptide by chemical means.

The invention provides a process for producing nucleic acid of the invention, comprising the step of amplifying nucleic acid using a primer-based amplification method (e.g. PCR).

The invention provides a process for producing nucleic acid of the invention, comprising the step of synthesising at least part of the nucleic acid by chemical means.

Purification and Recombinant Expression

The Gram positive bacteria AI proteins of the invention may be isolated from the native Gram positive bacteria, or they may be recombinantly produced, for instance in a heterologous host. For example, the GAS, GBS, and S. pneumoniae antigens of the invention may be isolated from Streptococcus agalactiae, S. pyogenes, or S. pneumoniae, or they may be recombinantly produced, for instance, in a heterologous host. Preferably, the GBS antigens are prepared using a heterologous host.

The heterologous host may be prokaryotic (e.g. a bacterium) or eukaryotic. It is preferably E. coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae, Salmonella typhi, Salmonella typhimurium, Neisseria lactamica, Neisseria cinerea, Mycobacteria (e.g. M. tuberculosis), S. gordonii, L. lactis, yeasts, etc.

Recombinant production of polypeptides is facilitated by adding a tag protein to the Gram positive bacteria AI sequence to be expressed as a fusion protein comprising the tag protein and the Gram positive bacteria antigen. For example, recombinant production of polypeptides is facilitated by adding a tag protein to the GBS antigen to be expressed as a fusion protein comprising the tag protein and the GBS antigen. Such tag proteins can facilitate purification, detection and stability of the expressed protein. Tag proteins suitable for use in the invention include a polyarginine tag (Arg-tag), polyhistidine tag (His-tag), FLAG-tag, Strep-tag, c-myc-tag, S-tag, calmodulin-binding peptide, cellulose-binding domain, SBP-tag, chitin-binding domain, glutathione S-transferase-tag (GST), maltose-binding protein, transcription termination anti-termination factor (NusA), E. coli thioredoxin (TrxA) and protein disulfide isomerase I (DsbA). Preferred tag proteins include His-tag and GST. A full discussion on the use of tag proteins can be found in Terpe et al., “Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems”, Appl Microbiol Biotechnol (2003) 60:523-533.

After purification, the tag proteins may optionally be removed from the expressed fusion protein, e.g., by specifically tailored enzymatic treatments known in the art. Commonly used proteases include enterokinase, tobacco etch virus (TEV) protease, thrombin, and factor Xa.

GBS Polysaccharides

The compositions of the invention may be further improved by including GBS polysaccharides. Preferably, the GBS antigen and the saccharide each contribute to the immunological response in a recipient. The combination is particularly advantageous where the saccharide and polypeptide provide protection from different GBS serotypes.

The combined antigens may be present as a simple combination where separate saccharide and polypeptide antigens are administered together, or they may be present as a conjugated combination, where the saccharide and polypeptide antigens are covalently linked to each other.

Thus the invention provides an immunogenic composition comprising (i) one or more GBS AI proteins and (ii) one or more GBS saccharide antigens. The polypeptide and the polysaccharide may advantageously be covalently linked to each other to form a conjugate.

Between them, the combined polypeptide and saccharide antigens preferably cover (or provide protection from) two or more GBS serotypes (e.g. 2, 3, 4, 5, 6, 7, 8 or more serotypes). The serotypes of the polypeptide and saccharide antigens may or may not overlap. For example, the polypeptide might protect against serotype II or V, while the saccharide protects against serotype Ia, Ib, or III. Preferred combinations protect against the following groups of serotypes: (1) serotypes Ia and Ib, (2) serotypes Ia and II, (3) serotypes Ia and III, (4) serotypes Ia and IV, (5) serotypes Ia and V, (6) serotypes Ia and VI, (7) serotypes Ia and VII, (8) serotypes Ia and VIII, (9) serotypes Ib and II, (10) serotypes Ib and III, (11) serotypes Ib and IV, (12) serotypes Ib and V, (13) serotypes Ib and VI, (14) serotypes Ib and VII, (15) serotypes Ib and VIII, (16) serotypes II and III, (17) serotypes II and IV, (18) serotypes II and V, (19) serotypes II and VI, (20) serotypes II and VII, (21) serotypes II and VIII, (22) serotypes III and IV, (23) serotypes III and V, (24) serotypes III and VI, (25) serotypes III and VII, (26) serotypes III and VIII, (27) serotypes IV and V, (28) serotypes IV and VI, (29) serotypes IV and VII, (30) serotypes IV and VIII, (31) serotypes V and VI, (32) serotypes V and VII, (33) serotypes V and VIII, (34) serotypes VI and VII, (35) serotypes VI and VIII, and (36) serotypes VII and VIII.
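The thirty-six groups enumerated above correspond to every unordered pair that can be drawn from the nine GBS serotypes named in this section (Ia, Ib, II, III, IV, V, VI, VII and VIII). As a minimal illustrative sketch only, the enumeration can be reproduced as follows; the names SEROTYPES and serotype_pairs are hypothetical and form no part of the invention.

```python
from itertools import combinations

# The nine GBS serotypes named in this section, in the order used by the list.
SEROTYPES = ["Ia", "Ib", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def serotype_pairs(serotypes=SEROTYPES):
    """Return every unordered pair of serotypes, in list order."""
    return list(combinations(serotypes, 2))


pairs = serotype_pairs()

# Print the pairs in the same numbered style as the enumerated list.
for number, (first, second) in enumerate(pairs, start=1):
    print(f"({number}) serotypes {first} and {second}")
```

Generating the pairs in this way gives a simple check that an enumerated list of two-serotype combinations is complete and free of duplicates.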

Still more preferably, the combinations protect against the following groups of serotypes: (1) serotypes Ia and II, (2) serotypes Ia and V, (3) serotypes Ib and II, (4) serotypes Ib and V, (5) serotypes III and II, and (6) serotypes III and V. Most preferably, the combinations protect against serotypes III and V.

Protection against serotypes II and V is preferably provided by polypeptide antigens. Protection against serotypes Ia, Ib and/or III may be provided by polypeptide or saccharide antigens.

Immunogenic Compositions and Medicaments

Compositions of the invention are preferably immunogenic compositions, and are more preferably vaccine compositions. The pH of the composition is preferably between 6 and 8, preferably about 7. The pH may be maintained by the use of a buffer. The composition may be sterile and/or pyrogen-free. The composition may be isotonic with respect to humans.

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or therapeutic (i.e. to treat infection), but will typically be prophylactic. Accordingly, the invention includes a method for the therapeutic or prophylactic treatment of a Gram positive bacteria infection in an animal susceptible to such Gram positive bacterial infection comprising administering to said animal a therapeutic or prophylactic amount of the immunogenic composition of the invention. For example, the invention includes a method for the therapeutic or prophylactic treatment of a Streptococcus agalactiae, S. pyogenes, or S. pneumoniae infection in an animal susceptible to streptococcal infection comprising administering to said animal a therapeutic or prophylactic amount of the immunogenic compositions of the invention.

The invention also provides the compositions described herein for use as a medicament. The medicament is preferably able to raise an immune response in a mammal (i.e. it is an immunogenic composition) and is more preferably a vaccine.

The invention also provides the use of the compositions of the invention in the manufacture of a medicament for raising an immune response in a mammal. The medicament is preferably a vaccine.

The invention also provides kits comprising one or more containers of compositions of the invention. Compositions can be in liquid form or can be lyophilized, as can individual antigens. Suitable containers for the compositions include, for example, bottles, vials, syringes, and test tubes. Containers can be formed from a variety of materials, including glass or plastic. A container may have a sterile access port (for example, the container may be an intravenous solution bag or a vial having a stopper pierceable by a hypodermic injection needle). The composition may comprise a first component comprising one or more Gram positive bacteria AI proteins. Preferably, the AI proteins are surface AI proteins. Preferably, the AI surface proteins are in an oligomeric or hyperoligomeric form. For example, the first component comprises a combination of GBS antigens or GAS antigens, or S. pneumoniae antigens. Preferably said combination includes GBS 80. Preferably GBS 80 is present in an oligomeric or hyperoligomeric form.

The kit can further comprise a second container comprising a pharmaceutically-acceptable buffer, such as phosphate-buffered saline, Ringer's solution, or dextrose solution. It can also contain other materials useful to the end-user, including other buffers, diluents, filters, needles, and syringes. The kit can also comprise a second or third container with another active agent, for example an antibiotic.

The kit can also comprise a package insert containing written instructions for methods of inducing immunity against S. agalactiae and/or S. pyogenes and/or S. pneumoniae or for treating S. agalactiae and/or S. pyogenes and/or S. pneumoniae infections. The package insert can be an unapproved draft package insert or can be a package insert approved by the Food and Drug Administration (FDA) or other regulatory body.

The invention also provides a delivery device pre-filled with the immunogenic compositions of the invention.

The invention also provides a method for raising an immune response in a mammal comprising the step of administering an effective amount of a composition of the invention. The immune response is preferably protective and preferably involves antibodies and/or cell-mediated immunity. This immune response will preferably induce long lasting (e.g., neutralising) antibodies and a cell mediated immunity that can quickly respond upon exposure to one or more GBS and/or GAS and/or S. pneumoniae antigens. The method may raise a booster response.

The invention provides a method of neutralizing GBS, GAS, or S. pneumoniae infection in a mammal comprising the step of administering to the mammal an effective amount of the immunogenic compositions of the invention, a vaccine of the invention, or antibodies which recognize an immunogenic composition of the invention.

The mammal is preferably a human. Where the vaccine is for prophylactic use, the human is preferably a female (either of child bearing age or a teenager). Alternatively, the human may be elderly (e.g., over the age of 50, 55, 60, 65, 70 or 75) and may have an underlying disease such as diabetes or cancer. Where the vaccine is for therapeutic use, the human is preferably a pregnant female or an elderly adult.

These uses and methods are preferably for the prevention and/or treatment of a disease caused by Streptococcus agalactiae, or S. pyogenes, or S. pneumoniae. The compositions may also be effective against other streptococcal bacteria. The compositions may also be effective against other Gram positive bacteria.

One way of checking efficacy of therapeutic treatment involves monitoring Gram positive bacterial infection after administration of the composition of the invention. One way of checking efficacy of prophylactic treatment involves monitoring immune responses against the Gram positive bacterial antigens in the compositions of the invention after administration of the composition.

One way of checking efficacy of therapeutic treatment involves monitoring GBS infection after administration of the composition of the invention. One way of checking efficacy of prophylactic treatment involves monitoring immune responses against the GBS antigens in the compositions of the invention after administration of the composition.

A way of assessing the immunogenicity of the component proteins of the immunogenic compositions of the present invention is to express the proteins recombinantly and to screen patient sera or mucosal secretions by immunoblot. A positive reaction between the protein and the patient serum indicates that the patient has previously mounted an immune response to the protein in question; that is, the protein is an immunogen. This method may also be used to identify immunodominant proteins and/or epitopes.

Another way of checking efficacy of therapeutic treatment involves monitoring GBS or GAS or S. pneumoniae infection after administration of the compositions of the invention. One way of checking efficacy of prophylactic treatment involves monitoring immune responses both systemically (such as monitoring the level of IgG1 and IgG2a production) and mucosally (such as monitoring the level of IgA production) against the GBS and/or GAS and/or S. pneumoniae antigens in the compositions of the invention after administration of the composition. Typically, GBS and/or GAS and/or S. pneumoniae serum specific antibody responses are determined post-immunization but pre-challenge whereas mucosal GBS and/or GAS and/or S. pneumoniae specific antibody responses are determined post-immunization and post-challenge.

The vaccine compositions of the present invention can be evaluated in in vitro and in vivo animal models prior to host, e.g., human, administration.

The efficacy of immunogenic compositions of the invention can also be determined in vivo by challenging animal models of GBS and/or GAS and/or S. pneumoniae infection, e.g., guinea pigs or mice, with the immunogenic compositions. The immunogenic compositions may or may not be derived from the same serotypes as the challenge serotypes. Preferably the immunogenic compositions are derivable from the same serotypes as the challenge serotypes. More preferably, the immunogenic composition and/or the challenge serotypes are derivable from the group of GBS and/or GAS and/or S. pneumoniae serotypes.

In vivo efficacy models include but are not limited to: (i) a murine infection model using human GBS and/or GAS and/or S. pneumoniae serotypes; (ii) a murine disease model which is a murine model using a mouse-adapted GBS and/or GAS and/or S. pneumoniae strain, such as those strains outlined above which are particularly virulent in mice; and (iii) a primate model using human GBS or GAS or S. pneumoniae isolates.

The immune response may be one or both of a TH1 immune response and a TH2 response.

The immune response may be an improved or an enhanced or an altered immune response.

The immune response may be one or both of a systemic and a mucosal immune response.

Preferably the immune response is an enhanced systemic and/or mucosal response.

An enhanced systemic and/or mucosal immunity is reflected in an enhanced TH1 and/or TH2 immune response. Preferably, the enhanced immune response includes an increase in the production of IgG1 and/or IgG2a and/or IgA. Preferably the mucosal immune response is a TH2 immune response. Preferably, the mucosal immune response includes an increase in the production of IgA.

Activated TH2 cells enhance antibody production and are therefore of value in responding to extracellular infections. Activated TH2 cells may secrete one or more of IL-4, IL-5, IL-6, and IL-10. A TH2 immune response may result in the production of IgG1, IgE, IgA and memory B cells for future protection.

A TH2 immune response may include one or more of an increase in one or more of the cytokines associated with a TH2 immune response (such as IL-4, IL-5, IL-6 and IL-10), or an increase in the production of IgG1, IgE, IgA and memory B cells. Preferably, the enhanced TH2 immune response will include an increase in IgG1 production.

A TH1 immune response may include one or more of an increase in CTLs, an increase in one or more of the cytokines associated with a TH1 immune response (such as IL-2, IFNγ, and TNFβ), an increase in activated macrophages, an increase in NK activity, or an increase in the production of IgG2a. Preferably, the enhanced TH1 immune response will include an increase in IgG2a production.

Immunogenic compositions of the invention, in particular immunogenic compositions comprising one or more GAS antigens of the present invention, may be used either alone or in combination with other GAS antigens, optionally with an immunoregulatory agent capable of eliciting a Th1 and/or Th2 response.

Compositions of the invention will generally be administered directly to a patient. Certain routes may be favored for certain compositions, as resulting in the generation of a more effective immune response, preferably a CMI response, or as being less likely to induce side effects, or as being easier for administration. Direct delivery may be accomplished by parenteral injection (e.g. subcutaneously, intraperitoneally, intradermally, intravenously, intramuscularly, or to the interstitial space of a tissue), or by rectal, oral (e.g. tablet, spray), vaginal, topical, transdermal (e.g. see WO 99/27961) or transcutaneous (e.g. see WO 02/074244 and WO 02/064162), intranasal (e.g. see WO 03/028760), ocular, aural, pulmonary or other mucosal administration.

The invention may be used to elicit systemic and/or mucosal immunity.

In one particularly preferred embodiment, the immunogenic composition comprises one or more GBS or GAS or S. pneumoniae antigen(s) which elicit a neutralising antibody response and one or more GBS or GAS or S. pneumoniae antigen(s) which elicit a cell mediated immune response. In this way, the neutralising antibody response prevents or inhibits an initial GBS or GAS or S. pneumoniae infection while the cell-mediated immune response capable of eliciting an enhanced Th1 cellular response prevents further spreading of the GBS or GAS or S. pneumoniae infection. Preferably, the immunogenic composition comprises one or more GBS or GAS or S. pneumoniae surface antigens and one or more GBS or GAS or S. pneumoniae cytoplasmic antigens. Preferably the immunogenic composition comprises one or more GBS or GAS or S. pneumoniae surface antigens or the like and one or more other antigens, such as a cytoplasmic antigen capable of eliciting a Th1 cellular response.

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be used in a primary immunisation schedule and/or in a booster immunisation schedule. In a multiple dose schedule the various doses may be given by the same or different routes e.g. a parenteral prime and mucosal boost, a mucosal prime and parenteral boost, etc.

The compositions of the invention may be prepared in various forms. For example, the compositions may be prepared as injectables, either as liquid solutions or suspensions. Solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection can also be prepared (e.g. a lyophilised composition). The composition may be prepared for topical administration e.g. as an ointment, cream or powder. The composition may be prepared for oral administration e.g. as a tablet or capsule, as a spray, or as a syrup (optionally flavoured). The composition may be prepared for pulmonary administration e.g. as an inhaler, using a fine powder or a spray. The composition may be prepared as a suppository or pessary. The composition may be prepared for nasal, aural or ocular administration e.g. as drops. The composition may be in kit form, designed such that a combined composition is reconstituted just prior to administration to a patient. Such kits may comprise one or more antigens in liquid form and one or more lyophilised antigens.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of antigen(s), as well as any other components, such as antibiotics, as needed. By ‘immunologically effective amount’, it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention, or increases a measurable immune response or prevents or reduces a clinical symptom. This amount varies depending upon the health and physical condition of the individual to be treated, age, the taxonomic group of individual to be treated (e.g. non-human primate, primate, etc.), the capacity of the individual's immune system to synthesise antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials.

Further Components of the Composition

The composition of the invention will typically, in addition to the components mentioned above, comprise one or more ‘pharmaceutically acceptable carriers’, which include any carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are typically large, slowly metabolised macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of ordinary skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present. A thorough discussion of pharmaceutically acceptable excipients is available in Gennaro (2000) Remington: The Science and Practice of Pharmacy. 20th ed., ISBN: 0683306472.

Adjuvants

Vaccines of the invention may be administered in conjunction with other immunoregulatory agents. In particular, compositions will usually include an adjuvant. Adjuvants for use with the invention include, but are not limited to, one or more of the following set forth below:

A. Mineral Containing Compositions

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts, such as aluminum salts and calcium salts. The invention includes mineral salts such as hydroxides (e.g. oxyhydroxides), phosphates (e.g. hydroxyphosphates, orthophosphates), sulfates, etc. (e.g. see chapters 8 & 9 of Vaccine Design . . . (1995) eds. Powell & Newman. ISBN: 030644867X. Plenum.), or mixtures of different mineral compounds (e.g. a mixture of a phosphate and a hydroxide adjuvant, optionally with an excess of the phosphate), with the compounds taking any suitable form (e.g. gel, crystalline, amorphous, etc.), and with adsorption to the salt(s) being preferred. The mineral containing compositions may also be formulated as a particle of metal salt (WO 00/23105).

Aluminum salts may be included in vaccines of the invention such that the dose of Al³⁺ is between 0.2 and 1.0 mg per dose.

B. Oil-Emulsions

Oil-emulsion compositions suitable for use as adjuvants in the invention include squalene-water emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into submicron particles using a microfluidizer). See WO 90/14837. See also, Podda, “The adjuvanted influenza vaccines with novel adjuvants: experience with the MF59-adjuvanted vaccine”, Vaccine (2001) 19: 2673-2680; Frey et al., “Comparison of the safety, tolerability, and immunogenicity of a MF59-adjuvanted influenza vaccine and a non-adjuvanted influenza vaccine in non-elderly adults”, Vaccine (2003) 21:4234-4237. MF59 is used as the adjuvant in the FLUAD™ influenza virus trivalent subunit vaccine.

Particularly preferred adjuvants for use in the compositions are submicron oil-in-water emulsions. Preferred submicron oil-in-water emulsions for use herein are squalene/water emulsions optionally containing varying amounts of MTP-PE, such as a submicron oil-in-water emulsion containing 4-5% w/v squalene, 0.25-1.0% w/v Tween 80™ (polyoxyethylenesorbitan monooleate), and/or 0.25-1.0% Span 85™ (sorbitan trioleate), and, optionally, N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), for example, the submicron oil-in-water emulsion known as “MF59” (International Publication No. WO 90/14837; U.S. Pat. Nos. 6,299,884 and 6,451,325, incorporated herein by reference in their entireties; and Ott et al., “MF59 - Design and Evaluation of a Safe and Potent Adjuvant for Human Vaccines” in Vaccine Design: The Subunit and Adjuvant Approach (Powell, M. F. and Newman, M. J. eds.) Plenum Press, New York, 1995, pp. 277-296). MF59 contains 4-5% w/v squalene (e.g. 4.3%), 0.25-0.5% w/v Tween 80™, and 0.5% w/v Span 85™ and optionally contains various amounts of MTP-PE, formulated into submicron particles using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, Mass.). For example, MTP-PE may be present in an amount of about 0-500 μg/dose, more preferably 0-250 μg/dose and most preferably, 0-100 μg/dose. As used herein, the term “MF59-0” refers to the above submicron oil-in-water emulsion lacking MTP-PE, while the term MF59-MTP denotes a formulation that contains MTP-PE. For instance, “MF59-100” contains 100 μg MTP-PE per dose, and so on. MF69, another submicron oil-in-water emulsion for use herein, contains 4.3% w/v squalene, 0.25% w/v Tween 80™, and 0.75% w/v Span 85™ and optionally MTP-PE. Yet another submicron oil-in-water emulsion is MF75, also known as SAF, containing 10% squalene, 0.4% Tween 80™, 5% pluronic-blocked polymer L121, and thr-MDP, also microfluidized into a submicron emulsion. MF75-MTP denotes an MF75 formulation that includes MTP, such as from 100-400 μg MTP-PE per dose.

Submicron oil-in-water emulsions, methods of making the same and immunostimulating agents, such as muramyl peptides, for use in the compositions, are described in detail in International Publication No. WO 90/14837 and U.S. Pat. Nos. 6,299,884 and 6,451,325, incorporated herein by reference in their entireties.

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as adjuvants in the invention.

C. Saponin Formulations

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterologous group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots and even flowers of a wide range of plant species. Saponins from the bark of the Quillaia saponaria Molina tree have been widely studied as adjuvants. Saponin can also be commercially obtained from Smilax ornata (sarsaparilla), Gypsophila paniculata (brides veil), and Saponaria officinalis (soap root). Saponin adjuvant formulations include purified formulations, such as QS21, as well as lipid formulations, such as ISCOMs.

Saponin compositions have been purified using High Performance Thin Layer Chromatography (HP-TLC) and Reversed Phase High Performance Liquid Chromatography (RP-HPLC). Specific purified fractions using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A, QH-B and QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in U.S. Pat. No. 5,057,540. Saponin formulations may also comprise a sterol, such as cholesterol (see WO 96/33739).

Combinations of saponins and cholesterols can be used to form unique particles called Immunostimulating Complexes (ISCOMs). ISCOMs typically also include a phospholipid such as phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be used in ISCOMs. Preferably, the ISCOM includes one or more of Quil A, QHA and QHC. ISCOMs are further described in EP0109942, WO 96/11711 and WO 96/33739. Optionally, the ISCOMs may be devoid of additional detergent. See WO 00/07621.

A review of the development of saponin based adjuvants can be found in Barr, et al., “ISCOMs and other saponin based adjuvants”, Advanced Drug Delivery Reviews (1998) 32:247-271. See also Sjolander, et al., “Uptake and adjuvant activity of orally delivered saponin and ISCOM vaccines”, Advanced Drug Delivery Reviews (1998) 32:321-338.

D. Virosomes and Virus Like Particles (VLPs)

Virosomes and Virus Like Particles (VLPs) can also be used as adjuvants in the invention. These structures generally contain one or more proteins from a virus optionally combined or formulated with a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain any of the native viral genome. The viral proteins may be recombinantly produced or isolated from whole viruses. These viral proteins suitable for use in virosomes or VLPs include proteins derived from influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), Hepatitis E virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, Norwalk virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA-phage, fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed further in WO 03/024480, WO 03/024481, and Niikura et al., “Chimeric Recombinant Hepatitis E Virus-Like Particles as an Oral Vaccine Vehicle Presenting Foreign Epitopes”, Virology (2002) 293:273-280; Lenz et al., “Papillomavirus-Like Particles Induce Acute Activation of Dendritic Cells”, Journal of Immunology (2001) 5246-5355; Pinto, et al., “Cellular Immune Responses to Human Papillomavirus (HPV)-16 L1 in Healthy Volunteers Immunized with Recombinant HPV-16 L1 Virus-Like Particles”, Journal of Infectious Diseases (2003) 188:327-338; and Gerber et al., “Human Papillomavirus Virus-Like Particles Are Efficient Oral Immunogens when Coadministered with Escherichia coli Heat-Labile Enterotoxin Mutant R192G or CpG”, Journal of Virology (2001) 75(10):4752-4760. Virosomes are discussed further in, for example, Gluck et al., “New Technology Platforms in the Development of Vaccines for the Future”, Vaccine (2002) 20:B10-B16. Immunopotentiating reconstituted influenza virosomes (IRIV) are used as the subunit antigen delivery system in the intranasal trivalent INFLEXAL™ product (Mischler & Metcalfe (2002) Vaccine 20 Suppl 5:B17-23) and the INFLUVAC PLUS™ product.

E. Bacterial or Microbial Derivatives

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as:

(1) Non-Toxic Derivatives of Enterobacterial Lipopolysaccharide (LPS)

Such derivatives include Monophosphoryl lipid A (MPL) and 3-O-deacylated MPL (3dMPL). 3dMPL is a mixture of 3 De-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated chains. A preferred “small particle” form of 3 De-O-acylated monophosphoryl lipid A is disclosed in EP 0 689 454. Such “small particles” of 3dMPL are small enough to be sterile filtered through a 0.22 micron membrane (see EP 0 689 454). Other non-toxic LPS derivatives include monophosphoryl lipid A mimics, such as aminoalkyl glucosaminide phosphate derivatives e.g. RC-529. See Johnson et al. (1999) Bioorg Med Chem Lett 9:2273-2278.

(2) Lipid A Derivatives

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is described for example in Meraldi et al., “OM-174, a New Adjuvant with a Potential for Human Use, Induces a Protective Response when Administered with the Synthetic C-Terminal Fragment 242-310 from the circumsporozoite protein of Plasmodium berghei”, Vaccine (2003) 21:2485-2491; and Pajak, et al., “The Adjuvant OM-174 induces both the migration and maturation of murine dendritic cells in vivo”, Vaccine (2003) 21:836-842.

(3) Immunostimulatory Oligonucleotides

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide sequences containing a CpG motif (a sequence containing an unmethylated cytosine followed by guanosine and linked by a phosphate bond). Bacterial double stranded RNA or oligonucleotides containing palindromic or poly(dG) sequences have also been shown to be immunostimulatory.

The CpG's can include nucleotide modifications/analogs such asphosphorothioate modifications and can be double-stranded orsingle-stranded. Optionally, the guanosine may be replaced with ananalog such as 2′-deoxy-7-deazaguanosine. See Kandimalla, et al.,“Divergent synthetic nucleotide motif recognition pattern: design anddevelopment of potent immunomodulatory oligodeoxyribonucleotide agentswith distinct cytokine induction profiles”, Nucleic Acids Research(2003) 31(9): 2393-2400; WO02/26757 and WO99/62923 for examples ofpossible analog substitutions. The adjuvant effect of CpGoligonucleotides is further discussed in Krieg, “CpG motifs: the activeingredient in bacterial extracts?”, Nature Medicine (2003) 9(7):831-835; McCluskie, et al., “Parenteral and mucosal prime-boostimmunization strategies in mice with hepatitis B surface antigen and CpGDNA”, FEMS Immunology and Medical Microbiology (2002) 32:179-185;WO98/40100; U.S. Pat. No. 6,207,646; U.S. Pat. No. 6,239,116 and U.S.Pat. No. 6,429,199.

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT. See Kandimalla, et al., “Toll-like receptor 9: modulation of recognition and cytokine induction by novel synthetic CpG DNAs”, Biochemical Society Transactions (2003) 31 (part 3): 654-658. The CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs are discussed in Blackwell, et al., “CpG-A-Induced Monocyte IFN-gamma-Inducible Protein-10 Production is Regulated by Plasmacytoid Dendritic Cell Derived IFN-alpha”, J. Immunol. (2003) 170(8):4061-4068; Krieg, “From A to Z on CpG”, TRENDS in Immunology (2002) 23(2): 64-65 and WO01/95935. Preferably, the CpG is a CpG-A ODN.

Preferably, the CpG oligonucleotide is constructed so that the 5′ end is accessible for receptor recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3′ ends to form “immunomers”. See, for example, Kandimalla, et al., “Secondary structures in CpG oligonucleotides affect immunostimulatory activity”, BBRC (2003) 306:948-953; Kandimalla, et al., “Toll-like receptor 9: modulation of recognition and cytokine induction by novel synthetic CpG DNAs”, Biochemical Society Transactions (2003) 31(part 3):654-658; Bhagat et al., “CpG penta- and hexadeoxyribonucleotides as potent immunomodulatory agents” BBRC (2003) 300:853-861 and WO 03/035836.

(4) ADP-Ribosylating Toxins and Detoxified Derivatives Thereof.

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the invention. Preferably, the protein is derived from E. coli (i.e., E. coli heat labile enterotoxin “LT”), cholera (“CT”), or pertussis (“PT”). The use of detoxified ADP-ribosylating toxins as mucosal adjuvants is described in WO95/17211 and as parenteral adjuvants in WO98/42375. Preferably, the adjuvant is a detoxified LT mutant such as LT-K63, LT-R72, and LTR192G. The use of ADP-ribosylating toxins and detoxified derivatives thereof, particularly LT-K63 and LT-R72, as adjuvants can be found in the following references, each of which is specifically incorporated by reference herein in its entirety: Beignon, et al., “The LTR72 Mutant of Heat-Labile Enterotoxin of Escherichia coli Enhances the Ability of Peptide Antigens to Elicit CD4+ T Cells and Secrete Gamma Interferon after Coapplication onto Bare Skin”, Infection and Immunity (2002) 70(6):3012-3019; Pizza, et al., “Mucosal vaccines: non toxic derivatives of LT and CT as mucosal adjuvants”, Vaccine (2001) 19:2534-2541; Pizza, et al., “LTK63 and LTR72, two mucosal adjuvants ready for clinical trials” Int. J. Med. Microbiol (2000) 290(4-5):455-461; Scharton-Kersten et al., “Transcutaneous Immunization with Bacterial ADP-Ribosylating Exotoxins, Subunits and Unrelated Adjuvants”, Infection and Immunity (2000) 68(9):5306-5313; Ryan et al., “Mutants of Escherichia coli Heat-Labile Toxin Act as Effective Mucosal Adjuvants for Nasal Delivery of an Acellular Pertussis Vaccine: Differential Effects of the Nontoxic AB Complex and Enzyme Activity on Th1 and Th2 Cells” Infection and Immunity (1999) 67(12):6270-6280; Partidos et al., “Heat-labile enterotoxin of Escherichia coli and its site-directed mutant LTK63 enhance the proliferative and cytotoxic T-cell responses to intranasally co-immunized synthetic peptides”, Immunol. Lett. (1999) 67(3):209-216; Peppoloni et al., “Mutants of the Escherichia coli heat-labile enterotoxin as safe and strong adjuvants for intranasal delivery of vaccines”, Vaccines (2003) 2(2):285-293; and Pine et al., (2002) “Intranasal immunization with influenza vaccine and a detoxified mutant of heat labile enterotoxin from Escherichia coli (LTK63)” J. Control Release (2002) 85(1-3):263-270. Numerical reference for amino acid substitutions is preferably based on the alignments of the A and B subunits of ADP-ribosylating toxins set forth in Domenighini et al., Mol. Microbiol (1995) 15(6):1165-1167, specifically incorporated herein by reference in its entirety.

F. Bioadhesives and Mucoadhesives

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable bioadhesives include esterified hyaluronic acid microspheres (Singh et al. (2001) J. Cont. Rele. 70:267-276) or mucoadhesives such as cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone, polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as adjuvants in the invention. See, e.g., WO99/27960.

G. Microparticles

Microparticles may also be used as adjuvants in the invention. Microparticles (i.e. a particle of ˜100 nm to ˜150 μm in diameter, more preferably ˜200 nm to ˜30 μm in diameter, and most preferably ˜500 nm to ˜10 μm in diameter) formed from materials that are biodegradable and non-toxic (e.g. a poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a polycaprolactone, etc.) are preferred, with poly(lactide-co-glycolide) being particularly preferred. The microparticles are optionally treated to have a negatively-charged surface (e.g. with SDS) or a positively-charged surface (e.g. with a cationic detergent, such as CTAB).

H. Liposomes

Examples of liposome formulations suitable for use as adjuvants are described in U.S. Pat. No. 6,090,406, U.S. Pat. No. 5,916,588, and EP 0 626 169.

I. Polyoxyethylene Ether and Polyoxyethylene Ester Formulations

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene esters (WO99/52549). Such formulations further include polyoxyethylene sorbitan ester surfactants in combination with an octoxynol (WO01/21207) as well as polyoxyethylene alkyl ethers or ester surfactants in combination with at least one additional non-ionic surfactant such as an octoxynol (WO 01/21152).

Preferred polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl ether (laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether, polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.

J. Polyphosphazene (PCPP)

PCPP formulations are described, for example, in Andrianov et al., “Preparation of hydrogel microspheres by coacervation of aqueous polyphosphazene solutions”, Biomaterials (1998) 19(1-3):109-115 and Payne et al., “Protein Release from Polyphosphazene Matrices”, Adv. Drug. Delivery Review (1998) 31(3):185-196.

K. Muramyl Peptides

Examples of muramyl peptides suitable for use as adjuvants in the invention include N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE).

L. Imidazoquinolone Compounds.

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include Imiquimod and its homologues, described further in Stanley, “Imiquimod and the imidazoquinolones: mechanism of action and therapeutic potential” Clin Exp Dermatol (2002) 27(7):571-577 and Jones, “Resiquimod 3M”, Curr Opin Investig Drugs (2003) 4(2):214-218.

The invention may also comprise combinations of aspects of one or more of the adjuvants identified above. For example, the following adjuvant compositions may be used in the invention:

(1) a saponin and an oil-in-water emulsion (WO 99/11241);

(2) a saponin (e.g., QS21)+a non-toxic LPS derivative (e.g. 3dMPL) (see WO 94/00153);

(3) a saponin (e.g., QS21)+a non-toxic LPS derivative (e.g. 3dMPL)+a cholesterol;

(4) a saponin (e.g. QS21)+3dMPL+IL-12 (optionally+a sterol) (WO98/57659);

(5) combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions (See European patent applications 0835318, 0735898 and 0761231);

(6) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-block polymer L121, and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion;

(7) Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL+CWS (Detox™);

(8) one or more mineral salts (such as an aluminum salt)+a non-toxic derivative of LPS (such as 3dMPL);

(9) one or more mineral salts (such as an aluminum salt)+an immunostimulatory oligonucleotide (such as a nucleotide sequence including a CpG motif). Combination No. (9) is a preferred adjuvant combination.

M. Human Immunomodulators

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. interferon-γ), macrophage colony stimulating factor, and tumor necrosis factor.

Aluminum salts and MF59 are preferred adjuvants for use with injectable influenza vaccines. Bacterial toxins and bioadhesives are preferred adjuvants for use with mucosally-delivered vaccines, such as nasal vaccines.

The immunogenic compositions of the present invention may be administered in combination with an antibiotic treatment regime. In one embodiment, the antibiotic is administered prior to administration of the antigen of the invention or the composition comprising the one or more of the antigens of the invention.

In another embodiment, the antibiotic is administered subsequent to the administration of the one or more antigens of the invention or the composition comprising the one or more antigens of the invention. Examples of antibiotics suitable for use in the treatment of the Streptococcal infections of the invention include but are not limited to penicillin or a derivative thereof or clindamycin or the like.

Further Antigens

The compositions of the invention may further comprise one or moreadditional Gram positive bacterial antigens which are not associatedwith an AI. Preferably, the Gram positive bacterial antigens that arenot associated with an AI can provide protection across more than oneserotype or strain isolate. For example, a first non-AI antigen, inwhich the first non-AI antigen is at least 90% (i.e., at least 90, 91,92, 93, 94, 95, 96, 97, 98, 99, or 100%) homologous to the amino acidsequence of a second non-AI antigen, wherein the first and the secondnon-AI antigen are derived from the genomes of different serotypes of aGram positive bacteria, may be further included in the compositions. Thefirst non-AI antigen may also be homologous to the amino acid sequenceof a third non-AI antigen, such that the first non-AI antigen, thesecond non-AI antigen, and the third non-AI antigen are derived from thegenomes of different serotypes of a Gram positive bacteria. The firstnon-AI antigen may also be homologous to the amino acid sequence of afourth non-AI antigen, such that the first non-AI antigen, the secondnon-AI antigen, the third non-AI antigen, and the fourth non-AI antigenare derived from the genomes of different serotypes of a Gram positivebacteria.

The first non-AI antigen may be GBS 322. The amino acid sequence identity of GBS 322 across GBS strains from serotypes Ia, Ib, II, III, V, and VIII is greater than 90%. Alternatively, the first non-AI antigen may be GBS 276. The amino acid sequence identity of GBS 276 across GBS strains from serotypes Ia, Ib, II, III, V, and VIII is greater than 90%. Table 13 provides the percent amino acid sequence identity of GBS 322 and GBS 276 across different GBS strains and serotypes.

TABLE 13
Conservation of GBS 322 and GBS 276 amino acid sequences

                      GBS 322                  GBS 276
Serotype  Strain      cGH    % AA identity     cGH    % AA identity
Ia        090         +      98.60             +      97.90
          A909        +      98.30             +      97.90
          515         +      98.80             +      97.50
          DK1         +                        +
          DK8         +                        +
          Davis       +                        +
Ib        7357b       +                        +
          H36B        +      98.30             +      97.80
II        18RS21      +      100.00            +      99.90
          DK21        +                        +
III       NEM316      +      100.00            +      97.00
          COH31       +                        +
          D136        +                        +
          M732        +      98.00             +      100.00
          COH1        +      98.30             +      100.00
          M781        +      98.30             +      99.60
No type   CJB110      +      98.60             +      97.90
          1169NT      +      97.40             +      97.90
V         CJB111      +      100.00            +
          2603        +      100.00            +      100.00
VIII      JM130013    +      100.00            +      97.90
          SMU014      +                        +
total                 22/22  98.28 +/− 0.4     22/22  98.44 +/− 1.094

As an example, inclusion of a non-AI protein, GBS 322, in combination with AI antigens GBS 67, GBS 80, and GBS 104 provided protection to newborn mice in an active maternal immunization assay.

TABLE 14
Active maternal immunization assay for a combination of fragments from GBS 322, GBS 80, GBS 104, and GBS 67

                       FACS (Δ Mean)               MIX = 322 + 80 + 104 + 67      PBS
GBS strain  Type   GBS 80   GBS 67   GBS 322   alive/treated  % protection   alive/treated  % protection
515         Ia     0        409      227       39/40          97             6/40           15
7357b-      Ib     91       316      102       19/30          63             1/30           3
DK21        II     0        331      416       25/34          73             17/48          35
5401        II     170      618      135       35/40          87             3/37           8
3050        II     43       460      188       48/48          100            1/30           3
COH1        III    305      0        130       36/36          100            7/40           17
M781        III    65       0        224       30/40          75             4/39           10
2603        V      125      105      313       27/33          82             10/35          28
CJB111      V      370      481      63        25/28          89             4/46           9
JM9130013   VIII   597      83       143       37/39          95             5/40           12
JMU071      VIII   556      79       170       44/50          88             18/50          36
NT1169      NT     0        443      213       12/32          37             11/35          31
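The alive/treated counts reported in Table 14 can be converted to a percent protection figure with simple arithmetic. The following is a minimal sketch only; the function name `percent_protection` is hypothetical, and the rounding convention used in the table is not stated in the text, so the value is left unrounded here.

```python
def percent_protection(alive: int, treated: int) -> float:
    """Return the percentage of treated pups that survived.

    Illustrative helper (not part of the original disclosure): it simply
    computes 100 * alive / treated and applies no rounding.
    """
    if treated <= 0:
        raise ValueError("treated must be a positive count")
    if not 0 <= alive <= treated:
        raise ValueError("alive must lie between 0 and treated")
    return 100.0 * alive / treated


# Counts taken from the strain 515 row of Table 14.
mix_515 = percent_protection(39, 40)  # MIX group
pbs_515 = percent_protection(6, 40)   # PBS control group
print(f"MIX: {mix_515:.1f}%  PBS: {pbs_515:.1f}%")
```

Comparing the antigen group with the PBS group for the same challenge strain is what allows the protective effect to be read from the table.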

In fact, the non-AI GBS 322 antigen may itself provide protection to newborn mice in an active maternal immunization assay.

TABLE 16
Active maternal immunization assay for each of GBS 80 and GBS 322 antigens

                    GBS 80                                  GBS 322
                    FACS      Protection (% survival)       FACS      Protection (% survival)
GBS strain  Type    Δ Mean    antigen    ctrl-              Δ Mean    antigen    ctrl-
CJB111      V       370       72%        40%                63        57%        40%
COH1        III     305       76%        10%                130       3%         10%
2603        V       82        22%        34%                313       83%        34%
7357b-      Ib      91        36%        34%                102       43%        34%
18RS21      II      0         15%        24%                268       84%        24%
DK21        II      0         10%        21%                416       67%        25%
A909        Ia      0         0%         14%
090         Ia      0         0%         0%
H36B        Ib      105       34%        32%

Thus, inclusion of a non-AI protein in an immunogenic composition of the invention may provide increased protection to a mammal.

The immunogenic compositions comprising S. pneumoniae AI polypeptides may further comprise secondary SP protein antigens which include (1) any of the SP protein antigens disclosed in WO 02/077021 or U.S. provisional application ______, filed Apr. 20, 2005 (Attorney Docket Number 002441.00154), (2) immunogenic portions of the antigens comprising at least 7 contiguous amino acids, (3) proteins comprising amino acid sequences which retain immunogenicity and which are at least 95% identical to these SP protein antigens (e.g., 95%, 96%, 97%, 98%, 99%, or 99.5% identical), and (4) fusion proteins, including hybrid SP protein antigens, comprising (1)-(3).

Alternatively, the invention may include an immunogenic composition comprising a first and a second Gram positive bacteria non-AI protein, wherein the polynucleotide sequence encoding the sequence of the first non-AI protein is less than 90% (i.e., less than 90, 88, 86, 84, 82, 81, 78, 76, 74, 72, 70, 65, 60, 55, 50, 45, 40, 35, or 30 percent) homologous to the corresponding sequence in the genome of the second non-AI protein.

The compositions of the invention may further comprise one or moreadditional non-Gram positive bacterial antigens, including additionalbacterial, viral or parasitic antigens. The compositions of theinvention may further comprise one or more additional non-GBS antigens,including additional bacterial, viral or parasitic antigens.

In another embodiment, the GBS antigen combinations of the invention are combined with one or more additional, non-GBS antigens suitable for use in a vaccine designed to protect elderly or immunocompromised individuals. For example, the GBS antigen combinations may be combined with an antigen derived from the group consisting of Enterococcus faecalis, Staphylococcus aureus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Legionella pneumophila, Listeria monocytogenes, Neisseria meningitidis, influenza, and Parainfluenza virus (‘PIV’).

Where a saccharide or carbohydrate antigen is used, it is preferablyconjugated to a carrier protein in order to enhance immunogenicity {e.g.Ramsay et al. (2001) Lancet 357(9251):195-196; Lindberg (1999) Vaccine17 Suppl 2:S28-36; Buttery & Moxon (2000) J R Coll Physicians Lond34:163-168; Ahmad & Chapnick (1999) Infect Dis Clin North Am 13:113-133,vii.; Goldblatt (1998) J. Med. Microbiol. 47:563-567; European patent 0477 508; U.S. Pat. No. 5,306,492; International patent applicationWO98/42721; Conjugate Vaccines (eds. Cruse et al.) ISBN 3805549326,particularly vol. 10:48-114; and Hermanson (1996) BioconjugateTechniques ISBN: 0123423368 or 012342335X}. Preferred carrier proteinsare bacterial toxins or toxoids, such as diphtheria or tetanus toxoids.The CRM₁₉₇ diphtheria toxoid is particularly preferred {ResearchDisclosure, 453077 (January 2002)}. Other carrier polypeptides includethe N. meningitidis outer membrane protein (EP-A-0372501), syntheticpeptides (EP-A-0378881; EP-A-0427347), heat shock proteins (WO 93/17712;WO 94/03208), pertussis proteins (WO 98/58668; EP A 0471177), protein Dfrom H. influenzae (WO 00/56360), cytokines (WO 91/01146), lymphokines,hormones, growth factors, toxin A or B from C. difficile (WO00/61761),iron-uptake proteins (WO01/72337), etc. Where a mixture comprisescapsular saccharides from both serogroups A and C, it may be preferredthat the ratio (w/w) of MenA saccharide:MenC saccharide is greater than1 (e.g. 2:1, 3:1, 4:1, 5:1, 10:1 or higher). Different saccharides canbe conjugated to the same or different type of carrier protein. Anysuitable conjugation reaction can be used, with any suitable linkerwhere necessary.

Toxic protein antigens may be detoxified where necessary e.g. detoxification of pertussis toxin by chemical and/or genetic means.

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is preferred also to include diphtheria and tetanus antigens.

Antigens in the composition will typically be present at a concentration of at least 1 μg/ml each. In general, the concentration of any given antigen will be sufficient to elicit an immune response against that antigen.

As an alternative to using protein antigens in the composition of the invention, nucleic acid encoding the antigen may be used {e.g. refs. Robinson & Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648; Scott-Taylor & Dalgleish (2000) Expert Opin Investig Drugs 9:471-480; Apostolopoulos & Plebanski (2000) Curr Opin Mol Ther 2:441-447; Ilan (1999) Curr Opin Mol Ther 1:116-120; Dubensky et al. (2000) Mol Med 6:723-732; Robinson & Pertmer (2000) Adv Virus Res 55:1-74; Donnelly et al. (2000) Am J Respir Crit Care Med 162(4 Pt 2):S190-193; and Davis (1999) Mt. Sinai J. Med. 66:84-90}. Protein components of the compositions of the invention may thus be replaced by nucleic acid (preferably DNA e.g. in the form of a plasmid) that encodes the protein.

Definitions

The term “comprising” means “including” as well as “consisting” e.g. a composition “comprising” X may consist exclusively of X or may include something additional e.g. X+Y.

The term “about” in relation to a numerical value x means, for example, x±10%.

References to a percentage sequence identity between two amino acid sequences mean that, when the two sequences are aligned, that percentage of amino acids is the same in both sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in section 7.7.18 of Current Protocols in Molecular Biology (F. M. Ausubel et al., eds., 1987) Supplement 30. A preferred alignment is determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman homology search algorithm is disclosed in Smith & Waterman (1981) Adv. Appl. Math. 2: 482-489.
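As an illustration of the definition above, percent identity can be computed once an alignment is in hand. The sketch below is illustrative only: it assumes the two sequences have already been aligned (for example by a Smith-Waterman implementation using the stated gap penalties and BLOSUM62), that gaps are written as "-", and that the denominator is the number of aligned columns. The function name, the example peptides and the choice of denominator are assumptions, not part of the original disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the denominator is the number of alignment columns, excluding
    any column where both sequences have a gap. Other conventions exist
    (e.g. dividing by the length of the shorter sequence).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as a match only when both positions hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides, for illustration only.
print(round(percent_identity("MKT-LLA", "MKTALIA"), 2))
```

With this convention a gap opposite a residue counts as a mismatch, which lowers the reported identity relative to conventions that ignore gapped columns.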

The invention is further illustrated, without limitation, by the following examples.

EXAMPLE 1 Binding of an Adhesin Island Surface Protein, GBS 80, to Fibrinogen and Fibronectin

This example demonstrates that an Adhesin Island surface protein, GBS 80, can bind to fibrinogen and fibronectin.

An enzyme-linked immunosorbent assay (ELISA) was used to analyse the in vitro ability of recombinant GBS 80 to bind to immobilized extra-cellular matrix (ECM) proteins but not to bovine serum albumin (BSA). Microtiter plates were coated with ECM proteins (fibrinogen, fibronectin, laminin, collagen type IV) and binding assessed by adding varying concentrations of a recombinant form of GBS 80, over-expressed and purified from E. coli (FIG. 5A). Plates were then incubated sequentially with a) mouse anti-GBS 80 primary antibody; b) rabbit anti-mouse AP-conjugated secondary antibody; c) pNPP colorimetric substrate. Relative binding was measured by monitoring absorbance at 405 nm, using 595 nm as a reference wavelength. FIG. 5B shows binding of recombinant GBS 80 to immobilized ECM proteins (1 μg) as a function of concentration of GBS 80. BSA was used as a negative control. Data points represent the means of OD₄₀₅ values±standard deviation for 3 wells.
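The per-point summary described above (mean of OD₄₀₅ values ± standard deviation over 3 wells) can be sketched as follows. This is a hedged illustration: the text states only that 595 nm was used as a reference wavelength, so subtracting the 595 nm reading from the 405 nm reading is an assumption, as is the use of the sample (n-1) standard deviation. The well readings are invented for the example.

```python
from statistics import mean, stdev


def summarize_wells(wells):
    """Return (mean, standard deviation) of reference-corrected OD values.

    `wells` is a sequence of (od_405, od_595) pairs, one per replicate well.
    Assumptions: reference correction is a simple subtraction, and the
    sample standard deviation (n - 1 denominator) is reported.
    """
    corrected = [od_405 - od_595 for od_405, od_595 in wells]
    if len(corrected) < 2:
        raise ValueError("at least two wells are needed for a standard deviation")
    return mean(corrected), stdev(corrected)


# Hypothetical triplicate readings for one GBS 80 concentration.
triplicate = [(0.85, 0.05), (0.90, 0.05), (0.95, 0.05)]
avg, sd = summarize_wells(triplicate)
print(f"OD405 (corrected) = {avg:.3f} +/- {sd:.3f}")
```

Repeating this for each concentration of GBS 80 and each coated protein would give the points and error bars of a binding curve of the kind described for FIG. 5.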

Binding of GBS 80 to the tested ECM proteins was found to be concentration dependent and exhibited saturation kinetics. As is also evident from FIG. 5, binding of GBS 80 to fibronectin and fibrinogen was greater than binding to laminin and collagen type IV at all the concentrations tested.

EXAMPLE 2 GBS 80 is Required for Surface Localization of GBS 104

This example demonstrates that co-expression of GBS 80 is required for surface localization of GBS 104.

The polycistronic nature of the Adhesin Island I mRNA was investigated through reverse transcriptase-PCR (RT-PCR) analysis employing primers designed to detect transcripts arising from contiguous genes. Total RNA was isolated from GBS cultures grown to an optical density at 600 nm (OD₆₀₀) of 0.3 in THB (Todd-Hewitt broth) by the RNeasy Total RNA isolation method (Qiagen) according to the manufacturer's instructions. The absence of contaminating chromosomal DNA was confirmed by failure of the gene amplification reactions to generate a product detectable by agarose gel electrophoresis, in the absence of reverse transcriptase. RT-PCR analysis was performed with the Access RT-PCR system (Promega) according to the manufacturer's instructions, employing PCR cycling temperatures of 60° C. for annealing and 70° C. for extension. Amplification products were visualized alongside 100-bp DNA markers in 2% agarose gels after ethidium bromide staining.

FIG. 5 shows that all the genes are co-transcribed as an operon. A schematic of the AI-1 operon is shown above the agarose gel analysis of the RT-PCR products. Large rectangular arrows indicate the predicted transcript direction. Primer pairs were selected such that pairs “1-4” cross the 3′ finish-5′ start junction of successive genes and overlap each gene by at least 200 bp. Additionally, “1” crosses a putative rho-independent transcriptional terminator. “5” is an internal GBS 80 control and “6” is an unrelated control from a highly expressed gene. Lanes: “a”: RNA plus RTase enzyme; “b”: RNA without RTase; “c”: genomic DNA control.

In the effort to elucidate the functions of the AI-1 proteins, in-frame deletions of all of the genes within the operon have been constructed and the resulting mutants characterized with respect to surface exposure of the encoded antigens (see FIG. 8).

Each in-frame deletion mutation was constructed by splice overlap extension PCR (SOE-PCR) essentially as described by Horton et al. [Horton R. M., Z. L. Cai, S. N. Ho, L. R. Pease (1990) Biotechniques 8:528-35] using suitable primers and cloned into the temperature sensitive shuttle vector pJRS233 to replace the wild type copy by allelic exchange [Perez-Casal, J., J. A. Price, et al. (1993) Mol Microbiol 8(5): 809-19.]. All plasmid constructions utilized standard molecular biology techniques, and the identities of DNA fragments generated by PCR were verified by sequencing. Following SOE-PCR, the resulting mutant DNA fragments were digested with XhoI and EcoRI, and ligated into a similarly digested pJRS233. The resulting vectors were introduced by electroporation into the chromosome of 2603 and COH1 GBS strains in a three-step process, essentially as described in Framson et al. [Framson, P. E., A. Nittayajarn, J. Merry, P. Youngman, and C. E. Rubens. (1997) Appl. Environ. Microbiol. 63(9):3539-47]. Briefly, the vector pJRS233 contains an erm gene encoding erythromycin resistance and a temperature-sensitive gram-positive replicon that is active at 30° C. but not at 37° C. Initially, the constructs are electroporated into GBS electro-competent cells prepared as described by Framson et al., and transformants containing free plasmid are selected by their ability to grow at 30° C. on Todd-Hewitt Broth (THB) agar plates containing 1 μg/ml erythromycin. The second step includes a selection step for strains in which the plasmid has integrated into the chromosome via a single recombination event over the homologous plasmid insert and chromosome sequence by their ability to grow at 37° C. on THB agar medium containing 1 mg/ml erythromycin. In the third step, GBS cells containing the plasmid integrated within the chromosome (integrants) are serially passed in broth culture in the absence of antibiotics at 30° C. Plasmid excision from the chromosome via a second recombination event over the duplicated target gene sequence either completed the allelic exchange or reconstituted the wild-type genotype. Subsequent loss of the plasmid in the absence of antibiotic selection pressure resulted in an erythromycin-sensitive phenotype. In order to assess gene replacement, erythromycin-sensitive colonies were screened by analysis of the target gene PCR amplicons.

FIG. 7 reports a schematic of the AI-1 operon for each knock-out strain generated, along with the deletion position within the amino acid sequence. Most data presented here concern the COH1 deletion strains, in which the expression of each of the antigens is higher by DNA microarray analysis (data not shown) as well as detectable by FACS analysis (see FIG. 8). The 2603 Δ80, Δ104 double mutant was constructed by sequential allelic exchanges of the shown alleles.

Immunization Protocol

Immune sera for FACS experiments were obtained as follows.

Groups of 4 CD-1 outbred female mice 6-7 weeks old (Charles River Laboratories, Calco Italy) were immunized with the selected GBS antigens (20 μg of each recombinant GBS antigen), suspended in 100 μl of PBS. Each group received 3 doses at days 0, 21 and 35. Immunization was performed through intra-peritoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA) for the first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each immunization scheme negative and positive control groups are used. Immune response was monitored by using serum samples taken on day 0 and 49.

FACS Analysis

Preparation of paraformaldehyde treated GBS cells and their FACS analysis were carried out as follows.

Cells of GBS strain COH1 were grown in Todd Hewitt Broth (THB; Difco Laboratories, Detroit, Mich.) to OD600 nm=0.5. The culture was centrifuged for 20 minutes at 5000 rpm and bacteria were washed once with PBS, resuspended in PBS containing 0.05% paraformaldehyde, and incubated for 1 hour at 37° C. and then overnight at 4° C. 50 μl of fixed bacteria (OD600 0.1) were washed once with PBS, resuspended in 20 μl of Newborn Calf Serum (Sigma) and incubated for 20 min. at room temperature. The cells were then incubated for 1 hour at 4° C. in 100 μl of preimmune or immune sera, diluted 1:200 in dilution buffer (PBS, 20% Newborn Calf Serum, 0.1% BSA). After centrifugation and washing with 200 μl of washing buffer (0.1% BSA in PBS), samples were incubated for 1 hour at 4° C. with 50 μl of R-Phycoerythrin conjugated F(ab)2 goat anti-mouse IgG (Jackson ImmunoResearch Laboratories, Inc.), diluted 1:100 in dilution buffer. Cells were washed with 200 μl of washing buffer and resuspended in 200 μl of PBS. Samples were analysed using a FACS Calibur apparatus (Becton Dickinson, Mountain View, Calif.) and data were analyzed using the Cell Quest Software (Becton Dickinson). A shift in mean fluorescence intensity of >75 channels compared to preimmune sera from the same mice was considered positive. This cutoff was determined from the mean plus two standard deviations of shifts obtained with control sera raised against mock purified recombinant proteins from cultures of E. coli carrying the empty expression vector and included in every experiment. Artifacts due to bacterial lysis were excluded using antisera raised against 6 different known cytoplasmic proteins, all of which were negative.
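The positivity rule described above (a cutoff derived from the mean plus two standard deviations of control shifts, then applied to the immune minus preimmune shift) can be sketched as follows. This is an illustrative sketch only: the function names and the control values are hypothetical, and the use of the sample standard deviation is an assumption; only the ">75 channels" cutoff is taken from the text.

```python
from statistics import mean, stdev


def positivity_cutoff(control_shifts):
    """Mean plus two standard deviations of shifts seen with control sera.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return mean(control_shifts) + 2 * stdev(control_shifts)


def is_positive(immune_mfi, preimmune_mfi, cutoff=75.0):
    """True when the shift in mean fluorescence intensity exceeds the cutoff.

    The default of 75 channels is the cutoff stated in the text.
    """
    return (immune_mfi - preimmune_mfi) > cutoff


# Hypothetical control shifts (channels), for illustration only.
print(positivity_cutoff([40, 50, 60]))
print(is_positive(immune_mfi=400, preimmune_mfi=100))
print(is_positive(immune_mfi=150, preimmune_mfi=100))
```

Deriving the cutoff from mock-immunized controls ties the threshold to the background observed in the same experimental setup rather than to an arbitrary value.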

FACS data on COH1 single KO mutants for GBS 104 and GBS 80 indicated that GBS 80 is required for surface localization of GBS 104.

As shown in FIG. 8, GBS 104 is not surface exposed in the Δ80 strain (second column, bottom), but is present in the whole protein extracts (see FIG. 10). Mean shift values suggest that GBS 104 is partially responsible for GBS 80 surface exposure (Mean shift of GBS 80 is reduced to ˜60% wild-type levels in Δ104), and that GBS 80 is over-expressed in the complemented strain (mean shift value ˜200% wild-type level). The Δ80/pGBS 80 strain contains the GBS 80 orf cloned in the shuttle-vector pAM401 (Wirth, R., F. Y. An, et al. (1986). J Bacteriol 165(3): 831-6). The vector alone does not alter the secretion pattern of GBS 104 (right column). FACS was performed on mid-log fixed bacteria with mouse polyclonal antibodies as indicated at left. Black peak is pre-immune sera, colored peaks are sera from immunized animals.

EXAMPLE 3 Deletion of GBS 80 Causes Attenuation In Vivo

This example demonstrates that deletion of GBS 80 causes attenuation in vivo, suggesting that this protein contributes to bacterial virulence.

By using a mouse animal model, we studied the role of GBS 80 and GBS 104 in the virulence of S. agalactiae.

Groups of ten outbred female mice 5-6 weeks old (Charles River Laboratories, Calco, Italy) were inoculated intraperitoneally with different dilutions of the mutant strains and LD50 (lethal dose 50) values were calculated according to the method of Reed and Muench [Reed, L. J. and H. Muench (1938). The American Journal of Hygiene 27(3): 493-7]. As presented in the table below, the LD50, expressed as the number of colony forming units (cfu), for both the Δ80 mutant and the Δ80, Δ104 double mutant is about 10 fold higher than that of the wild type strain, suggesting that inactivation of GBS 80 but not GBS 104 is responsible for an attenuation in virulence. This finding indicates that the GBS 80 gene in AI-1 might contribute to virulence.

TABLE Lethal dose 50% analysis of AI-1 mutants in the 2603 strain background. LD₅₀ determinations were performed by IP injection of female CD1 mice at an age of 5-6 weeks. LD₅₀ values were calculated by the method of Reed and Muench (8).

GBS strain                  LD₅₀, cfu     Number of Experiments
Wild Type 2603              2 × 10⁸       4
Δ104 mutant                 ˜2 × 10⁸      1
Δ80 mutant                  2.6 × 10⁹     3
Δ80, Δ104 double mutant     ˜2 × 10⁹      1
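For readers unfamiliar with the Reed and Muench calculation cited above, the following is a minimal sketch of the cumulative-count interpolation it uses. The function name and the dose and mortality figures are hypothetical and are not data from this study; the sketch assumes equal log spacing is not required and interpolates on log10(dose) between the two doses that bracket 50% cumulative mortality.

```python
import math


def reed_muench_ld50(doses, dead, alive):
    """Estimate the LD50 by the Reed and Muench cumulative method.

    doses: challenge doses (e.g. cfu), any order
    dead, alive: animals dead / surviving at each dose
    """
    rows = sorted(zip(doses, dead, alive))  # ascending dose
    n = len(rows)

    # Cumulative dead: animals killed at this dose or any lower dose.
    cum_dead, running = [], 0
    for _, d, _ in rows:
        running += d
        cum_dead.append(running)

    # Cumulative survivors: animals surviving this dose or any higher dose.
    cum_alive, running = [0] * n, 0
    for i in range(n - 1, -1, -1):
        running += rows[i][2]
        cum_alive[i] = running

    pct = [100.0 * cd / (cd + ca) for cd, ca in zip(cum_dead, cum_alive)]

    # Interpolate on log10(dose) between the doses bracketing 50%.
    for i in range(n - 1):
        lo, hi = pct[i], pct[i + 1]
        if lo <= 50.0 <= hi and hi > lo:
            distance = (50.0 - lo) / (hi - lo)  # proportionate distance
            log_lo = math.log10(rows[i][0])
            log_hi = math.log10(rows[i + 1][0])
            return 10 ** (log_lo + distance * (log_hi - log_lo))
    raise ValueError("50% mortality is not bracketed by the tested doses")


# Hypothetical example: groups of ten mice at four ten-fold dilutions.
ld50 = reed_muench_ld50(
    doses=[1e7, 1e8, 1e9, 1e10],
    dead=[0, 3, 7, 10],
    alive=[10, 7, 3, 0],
)
print(f"{ld50:.2e}")  # about 3.16e+08 cfu for these made-up counts
```

Comparing two such estimates gives the fold change in LD50; for the values in the table, 2.6 × 10⁹ divided by 2 × 10⁸ is 13, in line with the "about 10 fold" attenuation stated for the Δ80 mutant.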

EXAMPLE 4 Effect of Adhesin Island Sortase Deletions on Surface Antigen Presentation

This example demonstrates the effect of adhesin island sortase deletions on surface antigen presentation.

FACS analysis results set forth in FIG. 9 show that a deletion in sortase SAG0648 prevented GBS 104 from reaching the surface and slightly reduced the surface exposure of GBS 80 (fourth panel; mean shift value ˜60% of wild-type COH1). In the double sortase knock-out strain, neither antigen was surface exposed (far right panel). Either sortase alone was sufficient for GBS 80 to arrive at the bacterial surface (third and fourth columns, top). No effect was seen on surface exposure of antigens GBS 80 or GBS 104 in the ΔGBS 52 strain. Antibodies derived from purified GBS 52 were either non-specific or were FACS negative for GBS 52 (data not shown). FACS analysis was performed as described above (see EXAMPLE 2).

As shown in FIG. 10, inactivation of GBS 80 has no effect on GBS 104 expression, just as GBS 104 knock out does not change the total amount of GBS 80 expressed. The Western blot of whole protein extracts (strains noted above lanes) probed with anti-GBS 80 antisera is shown in panel A. Arrow indicates expected size of GBS 80 (60 kDa). GBS 80 antibodies recognize a doublet; the lower band is not present in ΔGBS 80 strains. Panel B shows a Western blot of whole protein extracts probed with anti-GBS 104 antisera. Arrow indicates expected size of GBS 104 (99.4 kDa). Protein extracts were prepared from the same bacterial cultures used for FACS (FIGS. 8 and 9). In conclusion, although GBS 104 does not arrive at the surface in the Δ80 strain by FACS (FIG. 8, second column), it is present at approximately wild-type levels in the whole protein preps (B, second lane). Approximately 20 μg of each protein extract was loaded per lane.

Western-Blot Analysis

Aliquots of total protein extract mixed with SDS loading buffer (1×: 60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100 mM DTT) and boiled 5 minutes at 95° C. were loaded on a 12.5% SDS-PAGE precast gel (Biorad). The gel is run using an SDS-PAGE running buffer containing 250 mM TRIS, 2.5 mM Glycine and 0.1% SDS. The gel is electroblotted onto nitrocellulose membrane at 200 mA for 60 minutes. The membrane is blocked for 60 minutes with PBS/0.05% Tween-20 (Sigma), 10% skimmed milk powder and incubated O/N at 4° C. with PBS/0.05% Tween 20, 1% skimmed milk powder, with the appropriate dilution of the sera. After washing twice with PBS/0.05% Tween, the membrane is incubated for 2 hours with peroxidase-conjugated secondary anti-mouse antibody (Amersham) diluted 1:4000. The nitrocellulose is washed three times for 10 minutes with PBS/0.05% Tween and once with PBS and thereafter developed by Opti-4CN Substrate Kit (Biorad).

EXAMPLE 5 Binding of Adhesin Island Proteins to Epithelial Cells and Effect of Adhesin Island Proteins on Capacity of GBS to Adhere to Epithelial Cells

This example illustrates the binding of AI proteins to epithelial cells and the effect of AI proteins on the capacity of GBS to adhere to epithelial cells.

Applicants analysed whether recombinant AI surface proteins GBS 80 or GBS 104 would demonstrate binding to various epithelial cells in a FACS analysis. Applicants also analysed whether deletion of AI surface proteins GBS 80 or GBS 104 would affect the capacity of GBS to adhere to and invade ME180 cervical epithelial cells.

As shown in FIG. 28, deletion of GBS 80 sequence from GBS strain isolate 2603 (serotype V) did not affect the capacity of the mutated GBS to adhere to and invade ME180 cervical epithelial cells. Here ME180 cervical carcinoma epithelial cells were infected with wild type GBS 2603 or GBS 2603 Δ80 isogenic mutant. After two hours of infection, non-adherent bacteria were washed off and infection prolonged for a further two hours and four hours. In invasion experiments, each time point was followed by a two hour antibiotic treatment. Cells were then lysed with 1% saponin and lysates plated on TSA plates. As shown in FIG. 28, there was little difference between the percent invasion or percent adhesion of wild type and mutant strains up to the four hour time point.

FIG. 30 repeats this experiment with both Δ104 and Δ80 mutants from a different strain isolate. Here, ME180 cervical carcinoma epithelial cells were infected with GBS strain isolate COH1 (serotype III) wild type or COH1 ΔGBS 104 or COH1 Δ80 isogenic mutant. After one hour of infection, non-adherent bacteria were washed off and the cells were lysed with 1% saponin. The lysates were plated on TSA plates. As shown in FIG. 30, while there was little difference in the percent invasion, there was a significant decrease in the percent association of the Δ104 mutant compared to both the wild type and Δ80 mutant.

The effect of AI surface proteins on the ability of GBS to translocate through an epithelial monolayer was also analysed. As shown in FIG. 31, a GBS 80 knockout mutant strain partially loses the ability to translocate through an epithelial monolayer. Here epithelial monolayers were inoculated with wild type or knockout mutant in the apical chamber of a transwell system for two hours and then non-adherent bacteria were washed off. Infection was prolonged for a further two and four hours. Samples were taken from the media of the basolateral side and the number of colony forming units measured. Transepithelial electrical resistance measured prior to and after infection gave comparable values, indicating the maintenance of the integrity of the monolayer. By the six hour time point, the Δ80 mutants demonstrated a reduced percent transcytosis.

A similar experiment was conducted with GBS 104 knock out mutants. Here, as shown in FIG. 22, the Δ104 mutants also demonstrated a reduced percent transcytosis, indicating that the mutant strains translocate through an epithelial monolayer less efficiently than their isogenic wild type counterparts.

Applicants also studied the effect of AI proteins on the capacity of a GBS strain to invade J774 macrophage-like cells. Here, J774 cells were infected with GBS COH1 wild type or COH1 ΔGBS 104 or COH1 ΔGBS 80 isogenic mutants. After one hour of infection, non-adherent bacteria were washed off and intracellular bacteria were recovered at two, four and six hours post antibiotic treatment. At each time point, cells were lysed with 0.25% Triton X-100 and lysates plated on TSA plates. As shown in FIG. 32, the Δ104 mutant demonstrated a significantly reduced percent invasion compared to both the wild type and Δ80 mutant.

EXAMPLE 6 Hyperoligomeric Structures Comprising AI Surface Proteins GBS 80 and GBS 104

This example illustrates hyperoligomeric structures comprising AI surface proteins GBS 80 and GBS 104. A GBS isolate COH1 (serotype III) was adapted to increase expression of GBS 80. FIG. 34 presents a regular negative stain electron micrograph of this mutant; no pilus or hyperoligomeric structures are distinguishable on the surface of the bacteria. When the EM stain is based on anti-GBS 80 antibodies labelled with 10 or 20 nm gold particles, the presence of GBS 80 throughout the hyperoligomeric structure is clearly indicated (FIGS. 36, 37 and 38). EM staining against GBS 104 (anti-GBS 104 antibodies labelled with 10 nm gold particles) also reveals the presence of GBS 104 primarily on or near the surface of the bacteria or potentially associated with bacterial peptidoglycans (FIG. 39). Analysis of this same strain (over-expressing GBS 80) with a combination of both anti-GBS 80 (using 20 nm gold particles) and anti-GBS 104 (using 10 nm gold particles) reveals the presence of GBS 104 on the surface and within the hyperoligomeric structures (see FIGS. 40 and 41).

EXAMPLE 7 GBS 80 is Necessary for Polymer Formation and GBS 104 and Sortase SAG0648 are Necessary for Efficient Pili Assembly

This example demonstrates that GBS 80 is necessary for formation of polymers and that GBS 104 and sortase SAG0648 are necessary for efficient pili assembly. GBS 80 and GBS 104 polymeric assembly was systematically analyzed in Coh1 strain single knock out mutants of each of the relevant coding genes in AI-1 (GBS 80, GBS 104, GBS 52, sag0647, and sag0648). FIG. 41 provides Western blots of total protein extracts (strains noted above lanes) probed with either anti-GBS 80 sera (left panel) or anti-GBS 104 sera (right panel) for each of these Coh1 and Coh1 knock out strains. (Coh1, wild type Coh1; Δ80, Coh1 with GBS 80 knocked out; Δ104, Coh1 with GBS 104 knocked out; Δ52, Coh1 with GBS 52 knocked out; Δ647, Coh1 with SAG0647 knocked out; Δ648, Coh1 with SAG0648 knocked out; Δ647-8, Coh1 with SAG0647 and SAG0648 knocked out; Δ80/pGBS80, Coh1 with GBS 80 knocked out but complemented with a high copy number plasmid expressing GBS 80. Asterisks identify the monomer of GBS 80 and GBS 104.)

The smear of immunoreactive material observed in the wild type strain, along with its disappearance in Δ80 and Δ104 mutants, is consistent with the notion that such high molecular weight structures are composed of covalently linked (SDS-resistant) GBS 80 and GBS 104 subunits. The immunoblotting with both anti-GBS 80 (α-GBS 80) and anti-GBS 104 (α-GBS 104) revealed that deletion of sortase SAG0648 also interferes with the assembly of high molecular weight species, whereas the knock out mutant of the second sortase (SAG0647), even if somewhat reduced, still maintains the ability to form polymeric structures.

Total extracts from GBS were prepared as follows. Bacteria were grown in 50 ml of Todd-Hewitt broth (Difco) to an OD_(600nm) of 0.5-0.6 and subsequently pelleted. After two washes in PBS the pellet was resuspended and incubated 3 hours at 37° C. with mutanolysin. Cells were then lysed with at least three freezing-thawing cycles in dry ice and a 37° C. bath. The lysate was then centrifuged to eliminate the cellular debris and the supernatant was quantified. Approximately 40 μg of each protein extract was separated on SDS-PAGE. The gel was then subjected to immunoblotting with mouse antisera and detected with chemiluminescence.

EXAMPLE 8 GBS 80 is Polymerized by an AI-2 Sortase

This example illustrates that GBS 80 can be polymerized not only by AI-1 sortases, but also by AI-2 sortases. FIG. 42 shows total cell extract immunoblots of GBS 515 strain, which lacks AI-1. The left panel, where anti-GBS 67 sera were used, shows that GBS 67 from AI-2 is assembled into high-molecular-weight complexes, suggesting the formation of a second type of pilus. The same high molecular weight structure is observed when GBS 80 is highly expressed by reintroducing the gene within a plasmid (pGBS 80). By using anti-GBS 80 sera (right panel) on the same extracts, again it is observed that, with GBS 80 over-expression (515/pGBS 80), a high-molecular-weight structure is assembled. This implies that, in the absence of AI-1 sortases, AI-2 sortases (SAG1405 and SAG1406) can complement the lacking function, still being able to assemble GBS 80 in a pilus structure.

EXAMPLE 9 Coh1 Produces a High Molecular Weight Molecule, the GBS 80 Pilin

This example illustrates that Coh1 produces a high molecular weight molecule, greater than 1000 kDa, which is the GBS 80 pilin. FIG. 43 provides silver-stained electrophoretic gels that show that Coh1 produces two macromolecules. One of these macromolecules disappears in the Coh1 GBS 80 knock out cells, but does not disappear in the Coh1 GBS 52 knock out mutant cells. The last two lanes on the right were loaded with 15 times the amount loaded in the other lanes. This was done in order to be able to count the bands. By doing this, a conservative size estimate of the top bands was calculated by starting at 240 kDa and considering each of 14 higher bands as the result of consecutive additions of a GBS 80 monomer.

Coh1, wild type Coh1; Δ80, Coh1 cells with GBS 80 knocked out; Δ52, Coh1 cells with GBS 52 knocked out; Δ80/pGBS 80, Coh1 cells with GBS 80 knocked out and complemented with a high copy number construct expressing GBS 80.
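The band-counting estimate described in this example can be reproduced with simple arithmetic. The sketch below is illustrative only: the 240 kDa starting band and the 14 higher bands come from the text, while the 60 kDa monomer mass is an assumption taken from the expected GBS 80 size quoted for the Western blot in Example 4; the function name is hypothetical.

```python
BASE_BAND_KDA = 240      # lowest counted band, from the text
HIGHER_BANDS = 14        # number of higher bands counted, from the text
GBS80_MONOMER_KDA = 60   # assumption: expected GBS 80 size quoted in Example 4


def ladder_top_size(base_kda, n_higher_bands, monomer_kda):
    """Size of the top band if each higher band adds one monomer."""
    return base_kda + n_higher_bands * monomer_kda


estimate = ladder_top_size(BASE_BAND_KDA, HIGHER_BANDS, GBS80_MONOMER_KDA)
print(estimate)  # 1080 (kDa)
```

Under these assumptions the top band comes out at 1080 kDa, which is consistent with the "greater than 1000 kDa" figure stated at the start of the example.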

EXAMPLE 10 GBS 52 is a Minor Component of the GBS Pilus

This example illustrates that GBS 52 is present in the GBS pilus and is a minor component of the pilus. FIG. 45 shows an immunoblot of total cell extracts from a GBS Coh1 strain and a GBS Coh1 strain knocked out for GBS 52 (Δ52). The total cell extracts were immunoblotted with anti-GBS 80 antisera (left) and anti-GBS 52 antisera (right). Immunoblotting was performed using a 3-8% Tris-acetate polyacrylamide gel (Invitrogen) which provided excellent separation of large molecular weight proteins (see FIG. 41). When the gel was incubated with anti-GBS 80 sera, the bands from the Coh1 wild-type strain appeared shifted when compared to the Δ52 mutant. This observation indicated a different size of the pilus polymeric components in the two strains. When the same gel was stripped and incubated with anti-GBS 52 sera, the high-molecular-weight subunits in the Coh1 wild-type strain were similar in molecular size to those in the corresponding lane in the left panel. These findings confirmed that GBS 52 is indeed associated with GBS 80 macro-molecular structures but represents a minor component of the GBS pilus.

EXAMPLE 11 Pilus Structures are Present in the Supernatant of GBS Bacterial Cultures

This example illustrates that the pilus structure assembled in Coh1 GBS is present in the supernatant of a bacterial cell culture. FIG. 46 shows an immunoblot of protein extracts of the supernatants from cultures of different GBS mutant strains (117=Coh1 GBS 80 knockout; 159=Coh1 GBS 104 knockout; 202=Coh1 GBS 52 knockout; 206=Coh1 GBS sag0647 knockout; 208=Coh1 GBS sag0648 knockout; 197=Coh1 GBS sag0647/sag0648 knockout; 179=Coh1 GBS 80 knockout complemented with a high copy plasmid expressing GBS 80). GBS 80 antisera detect the presence of pilus structures in the appropriate Coh1 strains.

The protein extract was prepared as follows. Bacteria were grown in THB to an OD_(600nm) of 0.5-0.6 and the supernatant was separated from the cells by centrifugation. The supernatant was then filtered (0.2 μm) and 60% TCA was added to 1 ml of the filtrate for protein precipitation.

GBS pili were also extracted from the fraction of surface-exposed proteins in Coh1 strain and its GBS 80 knock out mutant as described hereafter. Bacteria were grown to an OD_(600nm) of 0.6 in 50 ml of THB at 37° C. Cells were washed once with PBS and the pellet was then resuspended in 0.1 M KPO4 pH 6.2, 40% sucrose, 10 mM MgCl2, 400 U/ml mutanolysin and incubated 3 hours at 37° C. Protoplasts were separated by centrifugation and the supernatant was recovered and its protein content measured.

In order to study the dynamics of pilus production during different growth phases, 1 ml of supernatant from a culture at different OD_(600nm) values was TCA precipitated and loaded onto a 3-8% SDS-PAGE gel as described before. FIG. 47 shows the corresponding Western blot with GBS 80 anti-sera. The first group of lanes (left five sample lanes) refers to Coh1 strain growth (OD_(600nm) values are noted above the lanes) whereas the second group of lanes (right five samples) is from a GBS 80 knock out strain over-expressing GBS 80. The experiment shows that pilus macromolecular structures can be found in the supernatant in all of the growth phases tested.

EXAMPLE 12 In GBS Strain Coh1, only GBS 80 and a Sortase (sag0647 or sag0648) are Required for Polymerization

This example describes requirements for pilus formation in Coh1. FIG. 48 shows a Western blot of total protein extracts (prepared as described before) using anti-GBS 80 sera on Coh1 clones. (Coh1, wild type Coh1; Δ104, Coh1 knocked out for GBS 104; Δ647, Coh1 knocked out for sag0647; Δ648, Coh1 knocked out for sag0648; Δ647-8, Coh1 knocked out for sag0647 and sag0648; 515, wild type bacterial strain 515, which lacks an AI-1; p80, a high copy number plasmid which expresses GBS 80.) The data show that only the double sortase mutant is unable to polymerize GBS 80, indicating that the ‘conditio sine qua non’ for pilus polymerization is the co-existence of GBS 80 with at least one sortase. This result leads to a reasonable assumption that SAG1405 and SAG1406 are responsible for polymerization in strain 515.

EXAMPLE 13 GBS 80 can be Expressed in L. lactis Under its Own Promoter and Terminator Sequences

This example demonstrates that L. lactis, a non-pathogenic bacterium, can express GBS AI polypeptides such as GBS 80. L. lactis M1363 (J. Bacteriol. 154 (1983):1-9) was transformed with a construct encoding GBS 80. Briefly, the construct was prepared by cloning a DNA fragment containing the gene coding for GBS 80 under its own promoter and terminator sequences into plasmid pAM401 (a shuttle vector for E. coli and Gram positive bacteria; J. Bacteriol. 165 (1986):831-836). Total extracts of the transformed bacteria in log phase were separated on SDS-PAGE, transferred to membranes, and incubated with antiserum against GBS 80. A polypeptide corresponding to the molecular weight of GBS 80 was detected in the lanes containing total extracts of L. lactis transformed with the GBS 80 construct. See FIGS. 133A and 133B, lanes 6 and 7. This same polypeptide was not detected in the lane containing total extracts of L. lactis not transformed with the GBS 80 construct, lane 9. This example shows that L. lactis can express GBS 80 under its own promoter and terminator.

EXAMPLE 14 L. lactis Modified to Express GBS AI-1 Under the GBS 80 Promoter and Terminator Sequences Expresses GBS 80 in Polymeric Structures

This example demonstrates the ability of L. lactis to express GBS AI-1 polypeptides and to incorporate at least some of the polypeptides into oligomers. L. lactis was transformed with a construct containing the genes encoding GBS AI-1 polypeptides. Briefly, the construct was prepared by cloning a DNA fragment containing the genes for GBS 80, GBS 52, SAG0647, SAG0648, and GBS 104 under the GBS 80 promoter and terminator sequences into construct pAM401. The construct was transformed into L. lactis M1363. Total extracts of log phase transformed bacteria were separated on reducing SDS-PAGE, transferred to membranes, and incubated with antiserum against GBS 80. A polypeptide with a molecular weight corresponding to the molecular weight of GBS 80 was detected in the lanes containing L. lactis transformed with the GBS AI-1 encoding construct. See FIG. 134, lane 2. In addition, the same lane also showed immunoreactivity of polypeptides having higher molecular weights than the polypeptide having the molecular weight of GBS 80. These higher molecular weight polypeptides are likely oligomers of GBS 80. Oligomers of similar molecular weights were also observed on a Western blot of the culture supernatant of the transformed L. lactis. See lane 4 of FIG. 135. Thus, this example shows that L. lactis transformed to express GBS AI-1 can efficiently polymerize GBS 80 in the form of a pilus. This pilus structure can likely be purified from either the cell culture supernatant or cell extracts.

EXAMPLE 15 Cloning and Expression of S. pneumoniae Sp0462

This example describes the production of a clone encoding a Sp0462 polypeptide and expression of the clone. To produce a clone encoding Sp0462, the open reading frame encoding Sp0462 was amplified using primers that annealed within the full-length Sp0462 open reading frame sequence. FIG. 150A provides an 893 amino acid sequence of Sp0462. The primers used to produce a clone encoding the Sp0462 polypeptide are shown in FIG. 150B. These primers annealed to the nucleotide sequences encoding the amino acid residues indicated by underlining in FIG. 150A. Amplification of the open reading frame encoding Sp0462 using these primers produced the amplicon shown at lane 2 of the agarose gel provided in FIG. 160. The Sp0462 clone encodes amino acid residues 38-862 of the 893 amino acid residue Sp0462 protein; the italicized residues in FIG. 150A were eliminated. FIG. 151A provides a schematic depiction of the recombinant Sp0462 polypeptide. FIG. 151B shows a schematic depiction of the full-length Sp0462 polypeptide. Both the recombinant Sp0462 encoded by the clone and the full-length Sp0462 protein have two collagen binding protein type B (Cna B) domains and a von Willebrand factor A (vWA) domain. The cloned recombinant Sp0462 lacks the LPXTG motif present in the full-length Sp0462 protein. Western blot analysis for expression of the Sp0462 clone did not result in detection of polypeptides with serum obtained from S. pneumoniae-infected patients (FIG. 152A) or GBS 80 antiserum (FIG. 152B).

EXAMPLE 16 Cloning and Expression of S. pneumoniae Sp0463

This example describes the production of a clone encoding a Sp0463 polypeptide and detection of recombinant Sp0463 polypeptide expressed from the clone. To produce a clone encoding Sp0463, the open reading frame encoding Sp0463 was amplified using primers that annealed within the full-length Sp0463 open reading frame sequence. FIG. 153A provides a 665 amino acid sequence of Sp0463. The primers used to produce the clone encoding Sp0463 polypeptide are shown in FIG. 153B. These primers annealed to the nucleotide sequences encoding the amino acid residues indicated by underlining in FIG. 153A. Amplification of the open reading frame encoding Sp0463 using these primers produced the amplicon shown at lane 3 of the agarose gel provided in FIG. 160. The Sp0463 clone encodes amino acid residues 23-627 of the 665 amino acid residue Sp0463 protein; the italicized residues in FIG. 153A were eliminated. FIG. 154A provides a schematic depiction of the recombinant Sp0463 polypeptide. FIG. 154B shows a schematic depiction of the full-length Sp0463 polypeptide. Both the recombinant Sp0463 encoded by the clone and the full-length Sp0463 protein have a Cna B domain and an E box motif. The cloned recombinant Sp0463 lacks the LPXTG motif present in the full-length Sp0463 protein. Expression of the Sp0463 clone resulted in the detection of a 60 kD polypeptide, the expected molecular weight of the recombinant Sp0463 polypeptide, by Western blot analysis. See FIG. 155.

EXAMPLE 17 Cloning and Expression of S. pneumoniae Sp0464

This example describes the production of a clone encoding a Sp0464 polypeptide and detection of recombinant Sp0464 polypeptide expressed from the clone. To produce a clone encoding Sp0464, the open reading frame encoding Sp0464 was amplified using primers that annealed within the full-length Sp0464 open reading frame sequence. FIG. 157A provides a 393 amino acid sequence of Sp0464. The primers used to produce a clone encoding the Sp0464 polypeptide are shown in FIG. 157B. These primers annealed to the nucleotide sequences encoding the amino acid residues indicated by underlining in FIG. 157A. Amplification of the open reading frame encoding Sp0464 using these primers produced the amplicon shown at lane 4 of the agarose gel provided in FIG. 160. The Sp0464 clone encodes amino acid residues 19-356 of the 393 amino acid residue Sp0464 protein; the italicized residues in FIG. 157A were eliminated. FIG. 158A provides a schematic depiction of the recombinant Sp0464 polypeptide. FIG. 158B shows a schematic depiction of the full-length Sp0464 polypeptide. Both the recombinant Sp0464 encoded by the clone and the full-length Sp0464 protein have two Cna B domains. The cloned recombinant Sp0464 lacks the LPXTG motif present in the full-length Sp0464 protein. Expression of the Sp0464 clone resulted in the detection of a 38 kD polypeptide, the expected molecular weight of the recombinant Sp0464 polypeptide, by Western blot analysis. See FIG. 159.

EXAMPLE 18 Intranasal Immunization of Mice with Recombinant L. lactis Expressing GBS 80 and Subsequent Challenge

This example describes a method of intranasally immunizing mice using L. lactis that express GBS 80. Intranasal immunization consisted of 3 doses at days 0, 14 and 28, each dose administered on three consecutive days. Each day, groups of 3 CD-1 outbred female mice 6-7 weeks old (Charles River Laboratories, Calco, Italy) were immunized intranasally with 10⁹ or 10¹⁰ CFU of the recombinant Lactococcus lactis suspended in 20 μl of PBS. In each immunization scheme negative (wild-type L. lactis) and positive (recombinant GBS80) control groups were used. The immune response of the dams was monitored by using serum samples taken on day 0 and 49. The female mice were bred 2-7 days after the last immunization (at approximately t=36-37), and typically had a gestation period of 21 days. Within 48 hours of birth, the pups were challenged via I.P. with GBS in a dose approximately equal to an amount which would be sufficient to kill 90% of immunized pups (as determined by empirical data gathered from PBS control groups). The GBS challenge dose is preferably administered in 50 μl of THB medium. Preferably, the pup challenge takes place at 56 to 61 days after the first immunization. The challenge inocula were prepared starting from frozen cultures diluted to the appropriate concentration with THB prior to use. Survival of pups was monitored for 5 days after challenge.

EXAMPLE 19 Subcutaneous Immunization of Mice with Recombinant L. lactis Expressing GBS 80 and Subsequent Challenge

This example describes a method of subcutaneously immunizing mice using L. lactis that express GBS 80. Subcutaneous immunization consisted of 3 doses at days 0, 14 and 28. Groups of 3 CD-1 outbred female mice 6-7 weeks old (Charles River Laboratories, Calco, Italy) were injected subcutaneously with 10⁹ or 10¹⁰ CFU of the recombinant Lactococcus lactis suspended in 100 μl of PBS. In each immunization scheme, negative (wild-type L. lactis) and positive (recombinant GBS80) control groups were used. The immune response of the dams was monitored by using serum samples taken on day 0 and 49. The female mice were bred 2-7 days after the last immunization (at approximately t=36-37), and typically had a gestation period of 21 days. Within 48 hours of birth, the pups were challenged via I.P. with GBS in a dose approximately equal to an amount which would be sufficient to kill 90% of immunized pups (as determined by empirical data gathered from PBS control groups). The GBS challenge dose is preferably administered in 50 μl of THB medium. Preferably, the pup challenge takes place at 56 to 61 days after the first immunization. The challenge inocula were prepared starting from frozen cultures diluted to the appropriate concentration with THB prior to use. Survival of pups was monitored for 5 days after challenge.

EXAMPLE 20 Immunization of Mice with GAS AI Polypeptides and Subsequent Intranasal Challenge

This example describes a method of immunizing mice with GAS AI polypeptides and subsequently intranasally challenging the mice with GAS bacteria. Groups of 10 CD1 female mice aged between 6 and 7 weeks are immunized with a combination of GAS antigens of the invention, GAS 15, GAS 16, and GAS 18 (15 μg of each recombinant antigen, derived from M1 strain SF370), or L. lactis expressing the M1 strain SF370 adhesin island, suspended in 100 μl of suitable solution. Each group receives 3 doses at days 0, 21 and 45. Immunization is performed through subcutaneous or intraperitoneal injection for the GAS 15, GAS 16, GAS 18 protein combination. The protein combination is administered with an equal volume of Complete Freund's Adjuvant (CFA) for the first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. Immunization is performed intranasally for the L. lactis expressing the M1 strain SF370 adhesin island. In each immunization scheme negative and positive control groups are used.

The negative control group for the mice immunized with the GAS 15, GAS 16, GAS 18 protein combination included mice immunized with PBS. The negative control group for the mice immunized with L. lactis expressing the M1 strain SF370 adhesin island included mice immunized with either wild type L. lactis or L. lactis transformed with the pAM401 expression vector lacking any cloned adhesin island sequence.

The positive control groups included mice immunized with purified M1 strain SF370 M protein.

Immunized mice are then anaesthetized with Zoletil and challenged intranasally with a 25 μL suspension containing 1.2×10⁶ or 1.2×10⁸ CFU of ISS 3348 in THB. Animals are observed daily and checked for survival.

EXAMPLE 21 Active Maternal Immunization Assay

As used herein, an Active Maternal Immunization assay refers to an in vivo protection assay where female mice are immunized with the test antigen composition. The female mice are then bred and their pups are challenged with a lethal dose of GBS. Serum titers of the female mice during the immunization schedule are measured as well as the survival time of the pups after challenge.

Mouse Immunization

Specifically, groups of 4 CD-1 outbred female mice 6-8 weeks old (Charles River Laboratories, Calco, Italy) are immunized with one or more GBS antigens (20 μg of each recombinant GBS antigen), suspended in 100 μl of PBS. Each group receives 3 doses at days 0, 21 and 35. Immunization is performed through intra-peritoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA) for the first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each immunization scheme negative and positive control groups are used.

Immune response is monitored by using serum samples taken on day 0 and 49. The sera are analyzed as pools from each group of mice.

Active Maternal Immunization

A maternal immunization/neonatal pup challenge model of GBS infection was used to verify the protective efficacy of the antigens in mice. The mouse protection study was adapted from Rodewald et al. (Rodewald et al. J. Infect. Diseases 166, 635 (1992)). In brief, CD-1 female mice (6-8 weeks old) were immunized before breeding, as described above. The mice received 20 μg of protein per dose when immunized with a single antigen and 60 μg of protein per dose (15 μg of each antigen) when immunized with the combination of antigens. Mice were bred 2-7 days after the last immunization. Within 48 h of birth, pups were injected intraperitoneally with 50 μl of GBS culture. Challenge inocula were prepared starting from frozen cultures diluted to the appropriate concentration with THB before use. In preliminary experiments (not shown), the challenge doses per pup for each strain tested were determined to cause 90% lethality. Survival of pups was monitored for 2 days after challenge. Protection was calculated as (percentage dead in the control group minus percentage dead in the vaccine group) divided by percentage dead in the control group, multiplied by 100. Data were evaluated for statistical significance by Fisher's exact test.
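The protection formula and the significance test named above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and pup counts are hypothetical, and the two-sided Fisher's exact test is implemented here from the hypergeometric distribution using only the standard library (summing all tables no more probable than the observed one), which is one common convention and not necessarily the exact variant used in the study.

```python
from math import comb


def protection_percent(dead_control, n_control, dead_vaccine, n_vaccine):
    """(% dead control - % dead vaccine) / % dead control * 100."""
    pct_control = 100.0 * dead_control / n_control
    pct_vaccine = 100.0 * dead_vaccine / n_vaccine
    return (pct_control - pct_vaccine) / pct_control * 100.0


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. control, vaccine); columns are outcomes
    (e.g. dead, alive). Margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical litters: 36 of 40 control pups die, 10 of 40 vaccinated pups die.
protection = protection_percent(36, 40, 10, 40)
p_value = fisher_exact_two_sided(36, 4, 10, 30)
print(f"protection = {protection:.1f}%")
print(f"p = {p_value:.2e}")
```

With these made-up counts the control mortality is 90% and the vaccine mortality 25%, so the formula gives (90 - 25) / 90 × 100, or roughly 72% protection.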

EMBODIMENTS OF THE INVENTION

The invention encompasses, but is not limited to, the embodiments enumerated below.

1. An immunogenic composition comprising a purified Group B Streptococcus (GBS) adhesin island (AI) polypeptide in oligomeric form.

2. The immunogenic composition of embodiment 1 wherein the GBS AI polypeptide is selected from a GBS AI-1.

3. The immunogenic composition of embodiment 1 wherein the GBS AI polypeptide is selected from a GBS AI-2.

4. The immunogenic composition of any of embodiments 1-3 wherein the GBS AI polypeptide comprises a sortase substrate motif.

5. The immunogenic composition of embodiment 4 wherein the sortase substrate motif is an LPXTG motif.

6. The immunogenic composition of embodiment 5 wherein the LPXTG motif is represented by the amino acid sequence XPXTG, wherein the X at amino acid position 1 is an L, an I, or an F and the X at amino acid position 3 is any amino acid residue.

7. The immunogenic composition of any one of embodiments 1-3 wherein the GBS AI polypeptide affects the ability of GBS bacteria to adhere to epithelial cells.

8. The immunogenic composition of any one of embodiments 1-3 wherein the GBS AI polypeptide affects the ability of GBS bacteria to invade epithelial cells.

9. The immunogenic composition of any one of embodiments 1-3 wherein the GBS AI polypeptide affects the ability of GBS bacteria to translocate through an epithelial cell layer.

10. The immunogenic composition of any one of embodiments 1-3 wherein the GBS AI polypeptide is capable of associating with an epithelial cell surface.

11. The immunogenic composition of embodiment 10 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

12. The immunogenic composition of any of embodiments 1-3 wherein the GBS AI polypeptide is a full-length GBS AI protein.

13. The immunogenic composition of any of embodiments 1-3 wherein the GBS AI polypeptide is a fragment of a full-length GBS AI protein.

14. The immunogenic composition of embodiment 13 wherein the fragment comprises at least 7 contiguous amino acid residues of the GBS AI protein.

15. The immunogenic composition of embodiment 2 wherein the GBS AI polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof.

16. The immunogenic composition of embodiment 3 wherein the GBS AI polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof.

17. The immunogenic composition of embodiment 15 wherein the GBS AI polypeptide is GBS 80.

18. The immunogenic composition of any of embodiments 1-3 or 15-17 wherein the oligomeric form is a hyperoligomer.

19. The immunogenic composition of any of embodiments 1-3, or 15-17 further comprising a Gram positive bacterium antigen not associated with an AI.

20. The immunogenic composition of embodiment 19 wherein the antigen is selected from the group consisting of GBS 322 and GBS 276.

21. The immunogenic composition of embodiment 20 wherein the antigen is GBS 322.

22. An immunogenic composition comprising a purified Gram positive bacteria adhesin island (AI) polypeptide in an oligomeric form.

23. The immunogenic composition of embodiment 22 wherein the Gram positive bacteria is of a genus selected from the group consisting of Streptococcus, Enterococcus, Staphylococcus, or Listeria.

24. The immunogenic composition of embodiment 23 wherein the Gram positive bacteria is of the genus Streptococcus.

25. The immunogenic composition of any of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide comprises a sortase substrate motif.

26. The immunogenic composition of embodiment 25 wherein the sortase substrate motif is an LPXTG motif.

27. The immunogenic composition of any one of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide affects the ability of Gram positive bacteria to adhere to epithelial cells.

28. The immunogenic composition of any one of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide affects the ability of Gram positive bacteria to invade epithelial cells.

29. The immunogenic composition of any one of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide affects the ability of Gram positive bacteria to translocate through an epithelial cell layer.

30. The immunogenic composition of any one of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide is capable of associating with an epithelial cell surface.

31. The immunogenic composition of embodiment 30 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

32. The immunogenic composition of any of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide is a full-length Gram positive bacteria AI protein.

33. The immunogenic composition of any of embodiments 22-24 wherein the Gram positive bacteria AI polypeptide is a fragment of a full-length Gram positive bacteria AI protein.

34. The immunogenic composition of embodiment 33 wherein the fragment comprises at least 7 contiguous amino acid residues of the Gram positive bacteria AI protein.

35. The immunogenic composition of embodiment 24 wherein the genus Streptococcus bacteria is Group A Streptococcus (GAS) bacteria and the Gram positive bacteria AI polypeptide is a GAS AI polypeptide.

36. The immunogenic composition of embodiment 35 wherein the GAS AI polypeptide is selected from a GAS AI-1.

37. The immunogenic composition of embodiment 35 wherein the GAS AI polypeptide is selected from a GAS AI-2.

38. The immunogenic composition of embodiment 35 wherein the GAS AI polypeptide is selected from a GAS AI-3.

39. The immunogenic composition of embodiment 35 wherein the GAS AI polypeptide is selected from a GAS AI-4.

40. The immunogenic composition of any of embodiments 35-39 wherein the GAS AI polypeptide comprises a sortase substrate motif.

41. The immunogenic composition of embodiment 40 wherein the sortase substrate motif is an LPXTG motif.

42. The immunogenic composition of embodiment 41 wherein the LPXTG motif is represented by XXXXG, wherein the X at the first amino acid position is an L, a V, an E, or a Q, wherein the X at the second amino acid position is a P if the X at the first amino acid position is an L, the X at the second amino acid position is a V if the X at the first amino acid position is an E or a Q, or the X at the second amino acid position is a V or a P if the X at the first amino acid position is a V, wherein the X at the third amino acid position is any amino acid residue, and wherein the X at the fourth amino acid position is a T if the X at the first amino acid position is a V, an E, or a Q, or the X at the fourth amino acid position is a T, an S, or an A if the X at the first amino acid position is an L.

43. The immunogenic composition of any one of embodiments 35-39 wherein the GAS AI polypeptide affects the ability of GAS bacteria to adhere to epithelial cells.

44. The immunogenic composition of any one of embodiments 35-39 wherein the GAS AI polypeptide affects the ability of GAS bacteria to invade epithelial cells.

45. The immunogenic composition of any one of embodiments 35-39 wherein the GAS AI polypeptide affects the ability of GAS bacteria to translocate through an epithelial cell layer.

46. The immunogenic composition of any one of embodiments 35-39 wherein the GAS AI polypeptide is capable of associating with an epithelial cell surface.

47. The immunogenic composition of embodiment 46 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

48. The immunogenic composition of any of embodiments 35-39 wherein the GAS AI polypeptide is a full-length GAS AI protein.

49. The immunogenic composition of any of embodiments 35-39 wherein the GAS AI polypeptide is a fragment of a full-length GAS AI protein.

50. The immunogenic composition of embodiment 49 wherein the fragment comprises at least 7 contiguous amino acid residues of the GAS AI protein.

51. The immunogenic composition of embodiment 36 wherein the GAS AI-1 polypeptide is selected from the group consisting of M6_Spy0157, M6_Spy0159, M6_Spy0160, CDC SS 410_fimbrial, ISS3650_fimbrial, DSM2071_fimbrial, and fragments thereof.

52. The immunogenic composition of embodiment 37 wherein the GAS AI-2 polypeptide is selected from the group consisting of GAS 15, GAS 16, GAS 18, and fragments thereof.

53. The immunogenic composition of embodiment 38 wherein the GAS AI-3 polypeptide is selected from the group consisting of SpyM3_0098, SpyM3_0100, SpyM3_0102, SpyM3_0104, SPs0100, SPs0102, SPs0104, SPs0106, orf78, orf80, orf82, orf84, spyM18_0126, spyM18_0128, spyM18_0130, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, ISS4959_fimbrial, and fragments thereof.

53. The immunogenic composition of embodiment 39 wherein the GAS AI-4 polypeptide is selected from the group consisting of 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, ISS4538_fimbrial, and fragments thereof.

54. The immunogenic composition of embodiment 24 wherein the Streptococcus bacteria is Streptococcus pneumoniae and the Gram positive bacteria AI polypeptide is an S. pneumoniae AI polypeptide.

55. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide comprises a sortase substrate motif.

56. The immunogenic composition of embodiment 55 wherein the sortase substrate motif is an LPXTG motif.

57. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to adhere to epithelial cells.

58. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to invade epithelial cells.

59. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to translocate through an epithelial cell layer.

60. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide is capable of associating with an epithelial cell surface.

61. The immunogenic composition of embodiment 60 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

62. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide is a full-length S. pneumoniae AI protein.

63. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide is a fragment of a full-length S. pneumoniae AI protein.

64. The immunogenic composition of embodiment 63 wherein the fragment comprises at least 7 contiguous amino acid residues of the S. pneumoniae AI protein.

65. The immunogenic composition of embodiment 54 wherein the S. pneumoniae AI polypeptide is selected from the group consisting of SP0462, SP0463, SP0464, orf3_670, orf4_670, orf5_670, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, and fragments thereof.

66. The immunogenic composition of any one of embodiments 22-24, 35-39, 51-54, or 65 wherein the oligomeric form is a hyperoligomer.

67. The immunogenic composition of any one of embodiments 22-24, 35-39, 51-54, or 65 further comprising a Gram positive bacteria antigen not associated with an AI.

68. The immunogenic composition of embodiment 67 wherein the antigen is selected from the group consisting of GBS 322 and GBS 276.

69. An immunogenic composition comprising a first and a second Group B Streptococcus (GBS) adhesin island (AI) polypeptide.

70. The immunogenic composition of embodiment 69 wherein a full-length polynucleotide sequence encoding for the first GBS AI polypeptide is not present in a GBS bacteria genome comprising a polynucleotide sequence encoding for the second GBS AI polypeptide.

71. The immunogenic composition of embodiment 69 wherein polynucleotides encoding the first and the second GBS AI polypeptide are each present in genomes of more than one GBS serotype and strain isolate.

72. The immunogenic composition of embodiment 69 wherein the first GBS AI polypeptide is encoded by a GBS AI-1.

73. The immunogenic composition of embodiment 69 wherein the first GBS AI polypeptide is encoded by a GBS AI-2.

74. The immunogenic composition of embodiment 72 wherein the second GBS AI polypeptide is encoded by a GBS AI-2.

75. The immunogenic composition of embodiment 73 wherein the second GBS AI polypeptide is encoded by a GBS AI-2.

76. The immunogenic composition of embodiment 72 wherein the second GBS AI polypeptide is encoded by a GBS AI-1.

77. The immunogenic composition of embodiment 73 wherein the second GBS AI polypeptide is encoded by a GBS AI-1.

78. The immunogenic composition of embodiment 72 wherein the first GBS AI polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof.

79. The immunogenic composition of embodiment 73 wherein the first GBS AI polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof.

80. The immunogenic composition of embodiment 74 or 75 wherein the second GBS AI polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof, and wherein the first and the second GBS AI polypeptide are not the same polypeptide.

81. The immunogenic composition of embodiment 76 or 77 wherein the second GBS AI polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof, and wherein the first and the second GBS AI polypeptide are not the same polypeptide.

82. The immunogenic composition of any one of embodiments 69-77 wherein the first GBS AI polypeptide comprises a sortase substrate motif.

83. The immunogenic composition of embodiment 82 wherein the sortase substrate motif is an LPXTG motif.

84. The immunogenic composition of embodiment 83 wherein the LPXTG motif is represented by the sequence XPXTG, wherein the X at amino acid position 1 is an L, an I, or an F and the X at amino acid position 3 is any amino acid residue.

85. The immunogenic composition of any one of embodiments 69-77 wherein the first GBS AI polypeptide affects the ability of GBS bacteria to adhere to epithelial cells.

86. The immunogenic composition of any one of embodiments 69-77 wherein the first GBS AI polypeptide affects the ability of GBS bacteria to invade epithelial cells.

87. The immunogenic composition of any one of embodiments 69-77 wherein the first GBS AI polypeptide affects the ability of GBS bacteria to translocate through an epithelial cell layer.

88. The immunogenic composition of any one of embodiments 69-77 wherein the first GBS AI polypeptide is capable of associating with an epithelial cell surface.

89. The immunogenic composition of embodiment 88 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

90. The immunogenic composition of any of embodiments 69-77 wherein the first GBS AI polypeptide is a full-length GBS AI protein.

91. The immunogenic composition of any of embodiments 69-77 wherein the first GBS AI polypeptide is a fragment of a full-length GBS AI protein.

92. The immunogenic composition of embodiment 91 wherein the fragment comprises at least 7 contiguous amino acid residues of the first GBS AI protein.

93. The immunogenic composition of any one of embodiments 69-79 wherein the first GBS AI polypeptide is in oligomeric form.

94. The immunogenic composition of any one of embodiments 69-77 wherein the second GBS AI polypeptide is in oligomeric form.

95. The immunogenic composition of any one of embodiments 69-79 wherein the first and the second GBS AI polypeptide are associated in a single oligomeric form.

96. The immunogenic composition of embodiment 95 wherein the first and the second GBS AI polypeptides are chemically associated.

97. The immunogenic composition of embodiment 95 wherein the first and the second GBS AI polypeptides are physically associated.

98. The immunogenic composition of embodiment 93 wherein the oligomeric form is a hyperoligomer.

99. The immunogenic composition of embodiment 94 wherein the oligomeric form is a hyperoligomer.

100. The immunogenic composition of embodiment 76 wherein the first GBS AI polypeptide is GBS 80 and the second GBS AI polypeptide is GBS 104.

101. The immunogenic composition of embodiment 74 wherein the first GBS AI polypeptide is GBS 80 and the second GBS AI polypeptide is GBS 67.

102. The immunogenic composition of any one of embodiments 69-79, 100, or 101 further comprising a GBS polypeptide not associated with an AI.

103. The immunogenic composition of embodiment 102 wherein the GBS polypeptide not associated with an AI is selected from the group consisting of GBS 322 and GBS 276.

104. The immunogenic composition of embodiment 103 wherein the GBS polypeptide not associated with an AI is GBS 322.

105. An immunogenic composition comprising a first and a second Gram positive bacteria adhesin island (AI) polypeptide.

106. The immunogenic composition of embodiment 105 wherein a full length polynucleotide sequence encoding for the first Gram positive bacteria AI polypeptide is not present in a genome of a Gram positive bacteria comprising a full length polynucleotide sequence encoding for the second Gram positive bacteria AI polypeptide.

107. The immunogenic composition of embodiment 105 wherein polynucleotides encoding the first and the second Gram positive bacteria AI polypeptide are each present in genomes of more than one Gram positive bacteria serotype and strain isolate.

108. The immunogenic composition of embodiment 105 wherein the first and the second Gram positive bacteria AI polypeptides are of different Gram positive bacteria species.

109. The immunogenic composition of embodiment 105 wherein the first and the second Gram positive bacteria AI polypeptides are of the same Gram positive bacteria species.

110. The immunogenic composition of embodiment 105 wherein the first and the second Gram positive bacteria AI polypeptides are from different AI subtypes.

111. The immunogenic composition of embodiment 105 wherein the first and the second Gram positive bacteria AI polypeptides are from the same AI subtype.

112. The immunogenic composition of embodiment 105 wherein the first Gram positive bacteria AI polypeptide has detectable surface exposure on a first Gram positive bacteria strain or serotype but not a second Gram positive bacteria strain or subtype and the second Gram positive bacteria AI polypeptide has detectable surface exposure on the second Gram positive bacteria strain or serotype but not the first Gram positive bacteria strain or serotype.

113. The immunogenic composition of embodiment 105 wherein the Gram positive bacteria is S. pneumoniae, S. mutans, E. faecalis, E. faecium, C. difficile, L. monocytogenes, or C. diphtheriae.

114. The immunogenic composition of any of embodiments 105-113 wherein the first and the second Gram positive bacteria AI polypeptides comprise a sortase substrate motif.

115. The immunogenic composition of embodiment 114 wherein the sortase substrate motif is an LPXTG motif.

116. The immunogenic composition of embodiment 115 wherein the LPXTG motif is represented by XXXXG, wherein the X at amino acid position 1 is an L, a V, an E, an I, an F, or a Q, wherein X at amino acid position 2 is a P if X at amino acid position 1 is an L, an I, or an F, wherein X at amino acid position 2 is a V if X at amino acid position 1 is an E or a Q, wherein X at amino acid position 2 is a V or a P if X at amino acid position 1 is a V, wherein X at amino acid position 3 is any amino acid residue, wherein X at amino acid position 4 is a T if X at amino acid position 1 is a V, E, I, F, or Q, and wherein X at amino acid position 4 is a T, S, or A if X at amino acid position 1 is an L.

117. The immunogenic composition of embodiment 105 wherein the first Gram positive bacteria AI polypeptide is a first Group A Streptococcus (GAS) AI polypeptide.

118. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide comprises a sortase substrate motif.

119. The immunogenic composition of embodiment 118 wherein the sortase substrate motif is an LPXTG motif.

120. The immunogenic composition of embodiment 119 wherein the LPXTG motif is represented by XXXXG, wherein the X at the first amino acid position is an L, a V, an E, or a Q, wherein the X at the second amino acid position is a P if the X at the first amino acid position is an L, the X at the second amino acid position is a V if the X at the first amino acid position is an E or a Q, or the X at the second amino acid position is a V or a P if the X at the first amino acid position is a V, wherein the X at the third amino acid position is any amino acid residue, and wherein the X at the fourth amino acid position is a T if the X at the first amino acid position is a V, an E, or a Q, or the X at the fourth amino acid position is a T, an S, or an A if the X at the first amino acid position is an L.

121. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide affects the ability of GAS bacteria to adhere to epithelial cells.

122. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide affects the ability of GAS bacteria to invade epithelial cells.

123. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide affects the ability of GAS bacteria to translocate through an epithelial cell layer.

124. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is capable of associating with an epithelial cell surface.

125. The immunogenic composition of embodiment 117 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

126. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a full-length GAS AI protein.

127. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a fragment of a full-length GAS AI protein.

128. The immunogenic composition of embodiment 127 wherein the fragment comprises at least 7 contiguous amino acid residues of the GAS AI protein.

129. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a first GAS AI-1 polypeptide.

130. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a first GAS AI-2 polypeptide.

131. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a first GAS AI-3 polypeptide.

132. The immunogenic composition of embodiment 117 wherein the first GAS AI polypeptide is a first GAS AI-4 polypeptide.

133. The immunogenic composition of any one of embodiments 117 or 129-132 wherein the second Gram positive bacteria AI polypeptide is a second GAS AI polypeptide.

134. The immunogenic composition of embodiment 133 wherein the second GAS AI polypeptide is a second GAS AI-1 polypeptide.

135. The immunogenic composition of embodiment 133 wherein the second GAS AI polypeptide is a second GAS AI-2 polypeptide.

136. The immunogenic composition of embodiment 133 wherein the second GAS AI polypeptide is a second GAS AI-3 polypeptide.

137. The immunogenic composition of embodiment 133 wherein the second GAS AI polypeptide is a second GAS AI-4 polypeptide.

138. The immunogenic composition of embodiment 129 wherein the first GAS AI-1 polypeptide is selected from the group consisting of M6_Spy0157, M6_Spy0159, M6_Spy0160, CDC SS 410_fimbrial, ISS3650_fimbrial, DSM2071_fimbrial, and fragments thereof.

139. The immunogenic composition of embodiment 130 wherein the first GAS AI-2 polypeptide is selected from the group consisting of GAS 15, GAS 16, GAS 18, and fragments thereof.

140. The immunogenic composition of embodiment 131 wherein the first GAS AI-3 polypeptide is selected from the group consisting of SpyM3_0098, SpyM3_0100, SpyM3_0102, SpyM3_0104, SPs0100, SPs0102, SPs0104, SPs0106, orf78, orf80, orf82, orf84, spyM18_0126, spyM18_0128, spyM18_0130, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, ISS4959_fimbrial, and fragments thereof.

141. The immunogenic composition of embodiment 132 wherein the first GAS AI-4 polypeptide is selected from the group consisting of 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, ISS4538_fimbrial, and fragments thereof.

142. The immunogenic composition of embodiment 134 wherein the second GAS AI-1 polypeptide is selected from the group consisting of M6_Spy0157, M6_Spy0159, M6_Spy0160, CDC SS 410_fimbrial, ISS3650_fimbrial, DSM2071_fimbrial, and fragments thereof.

143. The immunogenic composition of embodiment 135 wherein the second GAS AI-2 polypeptide is selected from the group consisting of GAS 15, GAS 16, GAS 18, and fragments thereof.

144. The immunogenic composition of embodiment 136 wherein the second GAS AI-3 polypeptide is selected from the group consisting of SpyM3_0098, SpyM3_0100, SpyM3_0102, SpyM3_0104, SPs0100, SPs0102, SPs0104, SPs0106, orf78, orf80, orf82, orf84, spyM18_0126, spyM18_0128, spyM18_0130, spyM18_0132, SpyoM01000156, SpyoM01000155, SpyoM01000154, SpyoM01000153, SpyoM01000152, SpyoM01000151, SpyoM01000150, SpyoM01000149, ISS3040_fimbrial, ISS3776_fimbrial, ISS4959_fimbrial, and fragments thereof.

145. The immunogenic composition of embodiment 137 wherein the second GAS AI-4 polypeptide is selected from the group consisting of 19224134, 19224135, 19224137, 19224139, 19224141, 20010296_fimbrial, 20020069_fimbrial, CDC SS 635_fimbrial, ISS4883_fimbrial, ISS4538_fimbrial, and fragments thereof.

146. The immunogenic composition of any one of embodiments 117-132 or 138-141 wherein the second Gram positive bacteria AI polypeptide is a Group B Streptococcus (GBS) AI polypeptide.

147. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide comprises a sortase substrate motif.

148. The immunogenic composition of embodiment 147 wherein the sortase substrate motif is an LPXTG motif.

149. The immunogenic composition of embodiment 148 wherein the LPXTG motif is represented by the amino acid sequence XPXTG, wherein the X at amino acid position 1 is an L, an I, or an F and the X at amino acid position 3 is any amino acid residue.

150. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide affects the ability of GBS bacteria to adhere to epithelial cells.

151. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide affects the ability of GBS bacteria to invade epithelial cells.

152. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide affects the ability of GBS bacteria to translocate through an epithelial cell layer.

153. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide is capable of associating with an epithelial cell surface.

154. The immunogenic composition of embodiment 146 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

155. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide is a full-length GBS AI protein.

156. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide is a fragment of a full-length GBS AI protein.

157. The immunogenic composition of embodiment 156 wherein the fragment comprises at least 7 contiguous amino acid residues of the GBS AI protein.

158. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide is a GBS AI-1 polypeptide.

159. The immunogenic composition of embodiment 146 wherein the GBS AI polypeptide is a GBS AI-2 polypeptide.

160. The immunogenic composition of embodiment 158 wherein the GBS AI-1 polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof.

161. The immunogenic composition of embodiment 159 wherein the GBS AI-2 polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof.

162. The immunogenic composition of any one of embodiments 117-132 or 138-141 wherein the second Gram positive bacteria AI polypeptide is a Streptococcus pneumoniae AI polypeptide.

163. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide comprises a sortase substrate motif.

164. The immunogenic composition of embodiment 163 wherein the sortase substrate motif is an LPXTG motif.

165. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to adhere to epithelial cells.

166. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to invade epithelial cells.

167. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide affects the ability of S. pneumoniae to translocate through an epithelial cell layer.

168. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide is capable of associating with an epithelial cell surface.

169. The immunogenic composition of embodiment 168 wherein the associating with an epithelial cell surface is binding to the epithelial cell surface.

170. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide is a full-length S. pneumoniae AI protein.

171. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide is a fragment of a full-length S. pneumoniae AI protein.

172. The immunogenic composition of embodiment 162 wherein the fragment comprises at least 7 contiguous amino acid residues of the S. pneumoniae AI protein.

173. The immunogenic composition of embodiment 162 wherein the S. pneumoniae AI polypeptide is selected from the group consisting of SP0462, SP0463, SP0464, orf3_670, orf4_670, orf5_670, ORF3_14CSR, ORF4_14CSR, ORF5_14CSR, ORF3_19AH, ORF4_19AH, ORF5_19AH, ORF3_19FTW, ORF4_19FTW, ORF5_19FTW, ORF3_23FP, ORF4_23FP, ORF5_23FP, ORF3_23FTW, ORF4_23FTW, ORF5_23FTW, ORF3_6BF, ORF4_6BF, ORF5_6BF, ORF3_6BSP, ORF4_6BSP, ORF5_6BSP, ORF3_9VSP, ORF4_9VSP, ORF5_9VSP, and fragments thereof.

174. The immunogenic composition of any one of embodiments 105-117 wherein the first Gram positive bacteria AI polypeptide is in oligomeric form.

175. The immunogenic composition of embodiment 174 wherein the oligomeric form is a hyperoligomer.

176. The immunogenic composition of embodiment 174 wherein the second Gram positive bacteria AI polypeptide is in oligomeric form.

177. The immunogenic composition of embodiment 176 wherein the oligomeric form is a hyperoligomer.

178. The immunogenic composition of embodiment 176 wherein the first and the second Gram positive bacteria AI polypeptide are associated in a single oligomeric form.

179. The immunogenic composition of embodiment 178 wherein the first and the second Gram positive bacteria AI polypeptide are chemically associated.

180. The immunogenic composition of embodiment 178 wherein the first and the second Gram positive bacteria AI polypeptide are physically associated.

181. The immunogenic composition of any one of embodiments 105-117further comprising a Gram positive bacteria polypeptide not associatedwith an AI.

182. The immunogenic composition of embodiment 181 wherein the Grampositive bacteria polypeptide not associated with an AI is selected fromthe group consisting of GBS 322 and GBS 276.

183. The immunogenic composition of embodiment 182 wherein the Grampositive bacteria polypeptide not associated with an AI is GBS 322.

184. A modified Gram positive bacterium adapted to produce increased levels of AI surface protein.

185. The modified Gram positive bacterium of embodiment 184 wherein the AI surface protein is in oligomeric form.

186. The modified Gram positive bacterium of embodiment 185 wherein the oligomeric form is a hyperoligomer.

187. The modified Gram positive bacterium of any one of embodiments 184-186 which is a Group B Streptococcus bacterium.

188. The modified Gram positive bacterium of any one of embodiments 184-186 which is a Group A Streptococcus bacterium.

189. The modified Gram positive bacterium of any one of embodiments 184-186 which is a non-pathogenic Gram positive bacterium.

190. The modified Gram positive bacterium of embodiment 189 wherein the non-pathogenic Gram positive bacterium is Streptococcus gordonii.

191. The modified Gram positive bacterium of embodiment 189 wherein the non-pathogenic Gram positive bacterium is Lactococcus lactis.

192. The modified Gram positive bacterium of any one of embodiments 184-186 which has been inactivated and wherein the AI surface protein is exposed on the surface of the Gram positive bacterium.

193. The modified Gram positive bacterium of any one of embodiments 184-186 which has been attenuated and wherein the AI surface protein is exposed on the surface of the Gram positive bacterium.

194. The modified GBS bacterium of embodiment 187 which has been inactivated and wherein the AI surface protein is exposed on the surface of the GBS bacterium.

195. The modified GBS bacterium of embodiment 187 which has been attenuated and wherein the AI surface protein is exposed on the surface of the GBS bacterium.

196. The modified GAS bacterium of embodiment 188 which has been inactivated and wherein the AI surface protein is exposed on the surface of the GAS bacterium.

197. The modified GAS bacterium of embodiment 188 which has been attenuated and wherein the AI surface protein is exposed on the surface of the GAS bacterium.

198. The modified non-pathogenic bacterium of embodiment 189 which has been inactivated and wherein the AI surface protein is exposed on the surface of the non-pathogenic Gram positive bacterium.

199. The modified non-pathogenic bacterium of embodiment 189 which has been attenuated and wherein the AI surface protein is exposed on the surface of the non-pathogenic Gram positive bacterium.

200. A method for manufacturing an oligomeric adhesin island (AI) surface antigen comprising:

culturing a Gram positive bacterium that expresses an oligomeric AI surface antigen and

isolating the expressed oligomeric AI surface antigen.

201. The method of embodiment 200 wherein the step of isolating is performed by collecting said oligomeric AI surface antigen from Gram positive bacterium secretions in the Gram positive bacterium culture.

202. The method of embodiment 200 further comprising a step of purifying.

203. The method of embodiment 202 wherein the oligomeric AI surface antigen is purified from the Gram positive bacterium cell surface.

204. The method of embodiment 200 wherein the Gram positive bacterium is adapted for increased AI protein expression.

205. The method of any one of embodiments 200-204 wherein the Gram positive bacterium is a Group A Streptococcus bacterium.

206. The method of any one of embodiments 200-204 wherein the Gram positive bacterium is a Group B Streptococcus bacterium.

207. The method of any one of embodiments 200-204 wherein the oligomeric AI surface antigen is in hyperoligomeric form.

208. The method of embodiment 200 wherein the Gram positive bacterium expresses the oligomeric AI surface antigen recombinantly.

209. The method of embodiment 208 wherein the Gram positive bacterium further manipulated expresses at least 1 AI sortase.

210. The modified Gram positive bacterium of any one of embodiments 184-186 which is a S. pneumoniae bacterium.

211. The method of any one of embodiments 200-204 wherein the Gram positive bacterium is S. pneumoniae.

1. An immunogenic composition comprising a purified Group B Streptococcus (GBS) adhesin island (AI) polypeptide in oligomeric form.

2. The immunogenic composition of claim 1 wherein the GBS AI polypeptide is selected from a GBS AI-1.

3. The immunogenic composition of claim 1 wherein the GBS AI polypeptide is selected from a GBS AI-2.

4. The immunogenic composition of claim 2 wherein the GBS AI polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof.

5. The immunogenic composition of claim 3 wherein the GBS AI polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof.

6. The immunogenic composition of claim 4 wherein the GBS AI polypeptide is GBS 80.

7. The immunogenic composition of any of claims 1-6 wherein the oligomeric form is a hyperoligomer.

8 (22). An immunogenic composition comprising a purified Gram positive bacteria adhesin island (AI) polypeptide in an oligomeric form.

9 (23). The immunogenic composition of claim 8 wherein the Gram positive bacteria is of a genus selected from the group consisting of Streptococcus, Enterococcus, Staphylococcus, Clostridium, Corynebacterium, or Listeria.

10 (24). The immunogenic composition of claim 9 wherein the Gram positive bacteria is of the genus Streptococcus.

11 (35). The immunogenic composition of claim 10 wherein the genus Streptococcus bacteria is Group A Streptococcus (GAS) bacteria and the Gram positive bacteria AI polypeptide is a GAS AI polypeptide.

12 (36). The immunogenic composition of claim 11 wherein the GAS AI polypeptide is selected from a GAS AI-1.

13 (37). The immunogenic composition of claim 11 wherein the GAS AI polypeptide is selected from a GAS AI-2.

14 (38). The immunogenic composition of claim 11 wherein the GAS AI polypeptide is selected from a GAS AI-3.

15 (39). The immunogenic composition of claim 11 wherein the GAS AI polypeptide is selected from a GAS AI-4.

16 (66). The immunogenic composition of any one of claims 8-15 wherein the oligomeric form is a hyperoligomer.

17. An immunogenic composition comprising a first and a second Group B Streptococcus (GBS) adhesin island (AI) polypeptide.

18. The immunogenic composition of claim 17 wherein the first GBS AI polypeptide is encoded by a GBS AI-1.

19. The immunogenic composition of claim 18 wherein the second GBS AI polypeptide is encoded by a GBS AI-2.

20. The immunogenic composition of claim 18 wherein the first GBS AI polypeptide is selected from the group consisting of GBS 80, GBS 104, GBS 52, and fragments thereof.

21. The immunogenic composition of claim 19 wherein the second GBS AI polypeptide is selected from the group consisting of GBS 59, GBS 67, GBS 150, 01521, 01523, 01524, and fragments thereof, and wherein the first and the second GBS AI polypeptide are not the same polypeptide.

22. The immunogenic composition of claim 19 wherein the first GBS AI polypeptide is GBS 80 and the second GBS AI polypeptide is GBS 67.

23. An immunogenic composition comprising a first and a second Gram positive bacteria adhesin island (AI) polypeptide.

24. The immunogenic composition of claim 23 wherein the Gram positive bacteria is Streptococcus, Enterococcus, Staphylococcus, Clostridium, Corynebacterium, or Listeria.

25. The immunogenic composition of claim 23 wherein the first Gram positive bacteria AI polypeptide is a first Group A Streptococcus (GAS) AI polypeptide.

26. The immunogenic composition of claim 25 wherein the first GAS AI polypeptide is a first GAS AI-1 polypeptide.

27. The immunogenic composition of claim 25 wherein the first GAS AI polypeptide is a first GAS AI-2 polypeptide.

28. The immunogenic composition of claim 25 wherein the first GAS AI polypeptide is a first GAS AI-3 polypeptide.

29. The immunogenic composition of claim 25 wherein the first GAS AI polypeptide is a first GAS AI-4 polypeptide.

30. The immunogenic composition of any one of claims 25-29 wherein the second Gram positive bacteria AI polypeptide is a second GAS AI polypeptide.

31. The immunogenic composition of claim 30 wherein the second GAS AI polypeptide is a second GAS AI-1 polypeptide.

32. The immunogenic composition of claim 30 wherein the second GAS AI polypeptide is a second GAS AI-2 polypeptide.

33. The immunogenic composition of claim 30 wherein the second GAS AI polypeptide is a second GAS AI-3 polypeptide.

34. The immunogenic composition of claim 30 wherein the second GAS AI polypeptide is a second GAS AI-4 polypeptide.

35. A modified Gram positive bacterium adapted to produce increased levels of AI surface protein.

36. The modified Gram positive bacterium of claim 35 wherein the AI surface protein is in oligomeric form.

37. The modified Gram positive bacterium of claim 36 wherein the oligomeric form is a hyperoligomer.

38. The modified Gram positive bacterium of any one of claims 35-37 which is a non-pathogenic Gram positive bacterium.

39. The modified Gram positive bacterium of claim 38 wherein the non-pathogenic Gram positive bacterium is Lactococcus lactis.

40. A method for manufacturing an oligomeric adhesin island (AI) surface antigen comprising: culturing a Gram positive bacterium that expresses an oligomeric AI surface antigen and isolating the expressed oligomeric AI surface antigen.